Message-ID: <CAHk-=wjf-8ko=w7SyGWRLZ5bL_iwgw8mky8cxvYzF3xHDHoCMQ@mail.gmail.com>
Date: Fri, 7 Jun 2024 12:15:05 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Rasmus Villemoes <linux@...musvillemoes.dk>
Cc: Josh Poimboeuf <jpoimboe@...nel.org>, Peter Zijlstra <peterz@...radead.org>, 
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: objtool query: section start/end symbols?

On Fri, 7 Jun 2024 at 02:52, Rasmus Villemoes <linux@...musvillemoes.dk> wrote:
>
> FWIW, I did a POC some years ago but either never managed to send it, or
> never got a response. It did boot in virtme and I managed to get gdb to
> do disassembly to show that the dentry hash lookup did become a 'shift
> immediate'.
>
> https://github.com/Villemoes/linux/tree/rai

Looks conceptually very similar to what I do, except I literally
_only_ rewrite the constant itself in the instruction stream.

You end up using these replacement sequences, which certainly work,
but that limits your instruction scheduling a bit (ie the minimal size
ends up being a full branch to the thunk).

I started out wanting to literally replace a single 8-bit constant in
a shift-instruction that might be smaller than the jump.
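To make the "rewrite only the constant" idea concrete, here is a minimal
user-space sketch. It is not code from either patch set: struct imm8_site
and patch_imm8() are hypothetical names, and the "code" is just a byte
array rather than live kernel text. It assumes the build has recorded the
byte offset of each 8-bit shift immediate, so that patching touches exactly
one byte per site and leaves the opcode bytes alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: where an 8-bit immediate sits in the code buffer. */
struct imm8_site {
	size_t offset;	/* byte offset of the immediate itself */
};

/*
 * Overwrite only the immediate byte at each recorded site.
 * The surrounding opcode and ModRM bytes are not touched.
 */
static void patch_imm8(uint8_t *code, const struct imm8_site *sites,
		       size_t nr_sites, uint8_t value)
{
	for (size_t i = 0; i < nr_sites; i++)
		code[sites[i].offset] = value;
}
```

On x86, for example, "shr $imm8, %eax" encodes as the three bytes
c1 e8 ib, so a site for that instruction would point at the third byte.
A real implementation would also have to make the text writable and
serialize instruction fetch, which this sketch ignores.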

That said, you then made your approach work just fine by just
combining the shift with the address load, so it's not a single small
instruction that gets replaced any more.

And honestly, I think your approach may be better than mine.

The "replace constant in one instruction" approach works fine in my
tests, and gcc in particular seems to actually take advantage of the
instruction scheduling freedom (clang less so).

But your thunking approach would probably be much easier on
architectures like arm64 where the "load a constant" thing can be a
lot less convenient than one single contiguous value in memory.
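As an illustration of why a constant is less convenient to patch on arm64,
consider a 64-bit value materialized by a movz followed by three movk
instructions: each instruction carries only 16 bits of the value, in bits
[20:5] of its encoding, so the constant is scattered over four instruction
words. The sketch below is an assumption-laden user-space model, not code
from the thread; set_imm16() and patch_const64() are hypothetical helpers
operating on plain integers.

```c
#include <stdint.h>

/* Bits [20:5] of an AArch64 MOVZ/MOVK encoding hold the 16-bit immediate. */
#define IMM16_SHIFT	5
#define IMM16_MASK	(UINT32_C(0xffff) << IMM16_SHIFT)

/* Replace the immediate field of one MOVZ/MOVK instruction word. */
static uint32_t set_imm16(uint32_t insn, uint16_t imm)
{
	return (insn & ~IMM16_MASK) | ((uint32_t)imm << IMM16_SHIFT);
}

/*
 * insn[0..3] is assumed to be movz followed by three movk,
 * ordered from the lowest 16 bits of the value to the highest.
 */
static void patch_const64(uint32_t insn[4], uint64_t value)
{
	for (int i = 0; i < 4; i++)
		insn[i] = set_imm16(insn[i], (uint16_t)(value >> (16 * i)));
}
```

Compared with overwriting one contiguous value, every patch here needs a
read-modify-write of several instruction words, which is the sort of
inconvenience a thunk-based approach avoids.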

Would you be willing to resurrect your thing for a modern kernel? I'll
certainly try it out next to mine.

                    Linus
