linux-kernel - Re: [PATCH 4/7] arm64: add 'runtime constant' support


Message-ID: <ZmiN_7LMp2fbKhIw@J2N7QTR9R3>
Date: Tue, 11 Jun 2024 18:48:47 +0100
From: Mark Rutland <mark.rutland@....com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Anvin <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>,
	Borislav Petkov <bp@...en8.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Rasmus Villemoes <linux@...musvillemoes.dk>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	the arch/x86 maintainers <x86@...nel.org>,
	linux-arm-kernel@...ts.infradead.org,
	linux-arch <linux-arch@...r.kernel.org>
Subject: Re: [PATCH 4/7] arm64: add 'runtime constant' support

On Tue, Jun 11, 2024 at 09:56:17AM -0700, Linus Torvalds wrote:
> On Tue, 11 Jun 2024 at 07:29, Mark Rutland <mark.rutland@....com> wrote:
> >
> > Do we expect to use this more widely? If this only really matters for
> > d_hash() it might be better to handle this via the alternatives
> > framework with callbacks and avoid the need for new infrastructure.
> 
> Hmm. The notion of a callback for alternatives is intriguing and would
> be very generic, but we don't have anything like that right now.
> 
> Is anybody willing to implement something like that? Because while I
> like the idea, it sounds like a much bigger change.

Fair enough if that's a pain on x86, but we already have them on arm64, and
hence using them is a smaller change there. We already have a couple of cases
which use a MOVZ;MOVK;MOVK;MOVK sequence, e.g.

	// in __invalidate_icache_max_range()
	asm volatile(ALTERNATIVE_CB("movz %0, #0\n"
				    "movk %0, #0, lsl #16\n"
				    "movk %0, #0, lsl #32\n"
				    "movk %0, #0, lsl #48\n",
				    ARM64_ALWAYS_SYSTEM,
				    kvm_compute_final_ctr_el0)
		     : "=r" (ctr));

... which is patched via the callback:

	void kvm_compute_final_ctr_el0(struct alt_instr *alt,
				       __le32 *origptr, __le32 *updptr, int nr_inst)
	{
		generate_mov_q(read_sanitised_ftr_reg(SYS_CTR_EL0),
			       origptr, updptr, nr_inst);
	}

... where the generate_mov_q() helper does the actual instruction generation.

So if we only care about a few specific constants, we could give them their own
callbacks, like kvm_compute_final_ctr_el0() above.
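
For illustration only, here is a rough userspace sketch of the kind of work
such a helper has to do: split a 64-bit value into four 16-bit chunks and
encode one MOVZ plus three MOVKs. This is not the kernel's generate_mov_q();
the names sketch_mov_q(), sketch_decode_mov_q() and encode_mov_wide() are
made up for the example, and taking the destination register as a plain
argument is a simplifying assumption.

```c
#include <stdint.h>

/* A64 base opcodes for the 64-bit forms, with hw, imm16 and Rd all zero. */
#define MOVZ_X_BASE	0xD2800000u
#define MOVK_X_BASE	0xF2800000u

/*
 * Hypothetical helper: build one MOVZ/MOVK instruction word.
 * Field layout: hw in bits [22:21], imm16 in bits [20:5], Rd in bits [4:0].
 */
static uint32_t encode_mov_wide(uint32_t base, unsigned int rd,
				unsigned int hw, uint16_t imm16)
{
	return base | ((uint32_t)(hw & 0x3) << 21) |
	       ((uint32_t)imm16 << 5) | (rd & 0x1f);
}

/* Hypothetical stand-in for a mov_q generator: fills in four instructions. */
static void sketch_mov_q(uint64_t val, unsigned int rd, uint32_t insn[4])
{
	/* MOVZ sets bits [15:0] and zeroes the rest of the register. */
	insn[0] = encode_mov_wide(MOVZ_X_BASE, rd, 0, (uint16_t)val);

	/* Each MOVK replaces one further 16-bit chunk, keeping the others. */
	for (unsigned int hw = 1; hw < 4; hw++)
		insn[hw] = encode_mov_wide(MOVK_X_BASE, rd, hw,
					   (uint16_t)(val >> (16 * hw)));
}

/* Reassemble the 64-bit value from the four imm16 fields (for checking). */
static uint64_t sketch_decode_mov_q(const uint32_t insn[4])
{
	uint64_t val = 0;

	for (unsigned int hw = 0; hw < 4; hw++)
		val |= (uint64_t)((insn[hw] >> 5) & 0xffff) << (16 * hw);

	return val;
}
```

A per-constant callback would then only need to supply the value to encode
and write the resulting four words over the placeholder sequence.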

[...]

> > We have some helpers for instruction manipulation, and we can use
> > aarch64_insn_encode_immediate() here, e.g.
> >
> > #include <asm/insn.h>
> >
> > static inline void __runtime_fixup_16(__le32 *p, unsigned int val)
> > {
> >         u32 insn = le32_to_cpu(*p);
> >         insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, val);
> >         *p = cpu_to_le32(insn);
> > }
> 
> Ugh. I did that, and then noticed that it makes the generated code
> about ten times bigger.
> 
> That interface looks positively broken.
> 
> There is absolutely nobody who actually wants a dynamic argument, so
> it would have made both the callers and the implementation *much*
> simpler had the "AARCH64_INSN_IMM_16" been encoded in the function
> name the way I did it for my instruction rewriting.
>
> It would have made the use of it simpler, it would have avoided all
> the "switch (type)" garbage, and it would have made it all generate
> much better code.

Oh, completely agreed. FWIW, I have better versions sat in my
arm64/insn/rework branch, but I haven't had the time to get all the rest
of the insn framework cleanup sorted:

  https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=arm64/insn/rework&id=9cf0ec088c9d5324c60933bf3924176fea0a4d0b

I can go prioritise getting that bit out if it'd help, or I can clean
this up later.

Those allow the compiler to do much better, including compile-time (or
runtime) checks that immediates fit. For example:

	void encode_imm16(__le32 *p, u16 imm)
	{
		u32 insn = le32_to_cpu(*p);

		// Would warn if 'imm' were u32.
		// As u16 always fits, no warning
		BUILD_BUG_ON(!aarch64_insn_try_encode_unsigned_imm16(&insn, imm));

		*p = cpu_to_le32(insn);
	}

... compiles to:

	<encode_imm16>:
	       ldr     w2, [x0]
	       bfi     w2, w1, #5, #16
	       str     w2, [x0]
	       ret

... which I think is what you want?
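
To make the shape of that concrete, here is a minimal standalone sketch of a
type-specific "try encode" helper. It is not the code from the branch above;
sketch_try_encode_unsigned_imm16() and the two macros are invented for the
example. The only detail taken from the output above is that the imm16 field
is 16 bits wide starting at bit 5, matching the "bfi w2, w1, #5, #16".

```c
#include <stdbool.h>
#include <stdint.h>

/* imm16 field position, matching "bfi w2, w1, #5, #16" above. */
#define SKETCH_IMM16_SHIFT	5
#define SKETCH_IMM16_MASK	0xffffu

/*
 * Hypothetical helper: insert 'imm' into the imm16 field of '*insn'.
 * Returns false and leaves '*insn' untouched if 'imm' does not fit.
 */
static bool sketch_try_encode_unsigned_imm16(uint32_t *insn, uint64_t imm)
{
	if (imm > SKETCH_IMM16_MASK)
		return false;

	/* Clear the old field, then insert the new immediate. */
	*insn &= ~((uint32_t)SKETCH_IMM16_MASK << SKETCH_IMM16_SHIFT);
	*insn |= (uint32_t)imm << SKETCH_IMM16_SHIFT;

	return true;
}
```

Because the field position is fixed in the function rather than selected by
a runtime type argument, a caller passing a u16 lets the compiler drop the
range check and reduce the body to a single bitfield insert.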

> So I did that change you suggested, and then undid it again.
> 
> Because that whole aarch64_insn_encode_immediate() thing is an
> abomination, and should be burned at the stake.  It's misdesigned in
> the *worst* possible way.
> 
> And no, this code isn't performance-critical, but I have some taste,
> and the code I write will not be using that garbage.

Fair enough.

Mark.