linux-kernel - Re: [PATCH] x86: add 'runtime constant' infrastructure

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c380a87d-a5b2-4782-8bb0-2a10ed4fb9ad@zytor.com>
Date: Tue, 11 Jun 2024 11:22:31 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Borislav Petkov <bp@...en8.de>, Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Josh Poimboeuf <jpoimboe@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        linux-arch <linux-arch@...r.kernel.org>
Subject: Re: [PATCH] x86: add 'runtime constant' infrastructure

On 6/10/24 18:24, Linus Torvalds wrote:
> On Mon, 10 Jun 2024 at 18:09, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>>
>> Doing it in general is actually very very painful. Feel free to try -
>> but I can almost guarantee that you will throw out the "Keep It Simple
>> Stupid" approach and your patch will be twice the size if you do some
>> "rewrite the whole instruction" stuff.
>>
>> I really think there's a fundamental advantage to keeping things simple.
> 
> I guess the KISS approach would be to have a debug mode that just adds
> an 'int3' instruction *after* the constant. And then the constant
> rewriting rewrites the constant and just changes the 'int3' into the
> standard single-byte 'nop' instruction.
> 
> That wouldn't be complicated, and the cost would be minimal. But I
> don't see it being worth it, at least not for the current use where
> the unrewritten constant will just cause an oops on use.
> 

A general enough way to do it would be to put an int $3 and replace it 
with a ds: dummy prefix.

The issue there is "current use". I'm really, really worried about 
someone in the future putting this where it won't get properly patched 
and then all hell will be breaking loose.

Perhaps a better idea would be to provide the initial value as part of 
the declaration, so that the value is never "uninitialized" (much like a 
static variable can be initialized at compile time)?

In other words:

runtime_const_ptr(sym,init)

Unfortunately gas doesn't seem to properly implement the {nooptimize} 
prefix that is documented. This does require some gentle assembly hacking:

- Loading a pointer/long requires using the "movabs" mnemonic on x86-64. 
Combining that with

  (but not on x86-32 as there are no compacted forms of mov immediate; 
on x86-32 it is also legit to allow =rm rather than =r, but for an 8-bit 
immediate "=qm" has to be used.)

A size/type-generic version (one nice thing here is that init also ends 
up specifying the type):

#define _RUNTIME_CONST(where, sym, size) 				\
	"\n\t"							\
	".pushsection \"runtime_const_" #sym "\",\"a\"\n\t"	\
	".long (" #where ") - (" #size ") - .\n\t"		\
	".popsection"

extern void __noreturn __runtime_const_bad_size(void);

#define runtime_const(_sym, _init) ({ 				\
	typeof(_init) __ret; 					\
	const size_t __sz = sizeof(__ret); 			\
	switch (__sz) { 					\
	case 1:							\
		asm_inline("mov %[init],%[ret]\n1:"		\
		    _RUNTIME_CONST(1b, _sym, 1)			\
		    : [ret] "=qm" (__ret)			\
		    : [init] "i" (_init));			\
		break;						\
	case 8:							\
		asm_inline("movabs %[init],%[ret]\n1:"		\
		    _RUNTIME_CONST(1b, _sym, 8)			\
		    : [ret] "=r" (__ret)			\
		    : [init] "i" (_init));			\
		break;						\
	default:						\
		asm_inline("mov %[init],%[ret]\n1:"		\
		    _RUNTIME_CONST(1b, _sym, %c[sz])		\
		    : [ret] "=rm" (__ret)			\
		    : [init] "i" (_init), [sz] "n" (__sz)));	\
		break;						\
	}							\
	__ret; })


- For a shift count, it is unfortunately necessary to explicitly stitch 
together the instruction using .insn to avoid truncating the case where 
the operand is 1.

Size- and operand-generic version:

#define _runtime_const_shift(_val, _sym, _init, _op2) ({ 	\
	typeof(_val) __ret = (_val); 				\
	switch (sizeof(__ret)) {				\
	case 1:							\
		asm_inline(".insn 0xc0/%c[op2],%[ret],%[init]{:u8}\n1:" \
			_RUNTIME_CONST(1b, _sym, 1)		\
			: [ret] "+qm" (__ret)			\
			: [init] "i" ((u8)(_init)),		\
		 	  [op2] "n" (_op2));			\
		break; 						\
	default:						\
		asm_inline(".insn 0xc1/%c[op2],%[ret],%[init]{:u8}\n1:" \
			_RUNTIME_CONST(1b, _sym, 1)		\
			: [ret] "+rm" (__ret)			\
			: [init] "i" ((u8)(_init)),		\
		  	  [op2] "n" (_op2));			\
		break;						\
	}							\
	__ret; })						\

#define runtime_const_rol(v,s,i) _runtime_const_shift(v,s,i,0)
#define runtime_const_ror(v,s,i) _runtime_const_shift(v,s,i,1)
#define runtime_const_shl(v,s,i) _runtime_const_shift(v,s,i,4)
#define runtime_const_shr(v,s,i) _runtime_const_shift(v,s,i,5)
#define runtime_const_sar(v,s,i) _runtime_const_shift(v,s,i,7)

I am not sure if I'm missing something, but there really isn't a reason 
to use different section names for the shift counts specifically, is there?

NOTE: if we are *not* making these type-generic there is no reason 
whatsoever to not make these inlines...

	***


Also: one thing I would *really* like to see this being used for is 
cr4_pinned_bits, in which case one can, indeed, safely use a zero value 
at init time as long as cr4_pinned_mask is patched at the same time, 
which very much goes along with the above.

Again, this is a slightly less minimal versus what I had which was a 
maximal solution.

My approach would pretty much have targeted doing this for nearly all 
instances, which I eventually felt called for compiler support (or C++!) 
as adding a bunch of static_imm_*() macros all over the kernel really 
felt unpleasant.

	-hpa