Message-ID: <CAHk-=widPe38fUNjUOmX11ByDckaeEo9tN4Eiyke9u1SAtu9sA@mail.gmail.com>
Date: Tue, 11 Jun 2024 14:08:32 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Mark Rutland <mark.rutland@....com>
Cc: Peter Anvin <hpa@...or.com>, Ingo Molnar <mingo@...nel.org>, Borislav Petkov <bp@...en8.de>, 
	Thomas Gleixner <tglx@...utronix.de>, Rasmus Villemoes <linux@...musvillemoes.dk>, 
	Josh Poimboeuf <jpoimboe@...nel.org>, Catalin Marinas <catalin.marinas@....com>, 
	Will Deacon <will@...nel.org>, Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, 
	"the arch/x86 maintainers" <x86@...nel.org>, linux-arm-kernel@...ts.infradead.org, 
	linux-arch <linux-arch@...r.kernel.org>
Subject: Re: [PATCH 4/7] arm64: add 'runtime constant' support

On Tue, 11 Jun 2024 at 13:22, Mark Rutland <mark.rutland@....com> wrote:
>
> On arm64 we have early ("boot") and late ("system-wide") alternatives.
> We apply the system-wide alternatives in apply_alternatives_all(), a few
> callees deep under smp_cpus_done(), after secondary CPUs are brought up,
> since that has to handle mismatched features in big.LITTLE systems.

Annoyingly, we don't have any generic model for this. Maybe that would
be a good thing regardless, but your point that you have big.LITTLE
issues does kind of reinforce the fact that different architectures
have different requirements for the alternatives patching.

On arm64, the late alternatives seem to be in

  kernel_init() ->
    kernel_init_freeable() ->
      smp_init() ->
        smp_cpus_done() ->
          setup_system_features() ->
            setup_system_capabilities() ->
              apply_alternatives_all()

which is nice and late - that's when the system is fully initialized,
and kernel_init() is already running as the first real thread.

On x86, the alternatives are finalized much earlier in

  start_kernel() ->
    arch_cpu_finalize_init() ->
      alternative_instructions()

which is quite early, much closer to the early arm64 case.

Now, even that early x86 timing is good enough for vfs_caches_init_early(),
which is also done from start_kernel() fairly early on - and before
the arch_cpu_finalize_init() code is run.

But ...

> I had assumed that we could use late/system-wide alternatives here, since
> those get applied after vfs_caches_init_early(), but maybe that's too
> late?

So vfs_caches_init_early() is *one* case for the dcache init, but for
the NUMA case, we delay the dcache init until after the MM setup has
been completed, and do it relatively late in the init sequence, at
vfs_caches_init().

See that horribly named 'hashdist' variable ('dist' is not 'distance',
it's 'distribute'). It's not dcache-specific, btw - there are a couple
of other hashes that do that whole "NUMA distribution or not" thing.
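
Roughly, the hash-table part of the two dcache paths looks like this
(paraphrasing fs/dcache.c from memory, so don't trust the details -
the slab cache setup etc is omitted):

  static void __init dcache_init_early(void)
  {
          /* NUMA case: defer to vfs_caches_init() -> dcache_init() */
          if (hashdist)
                  return;

          dentry_hashtable = alloc_large_system_hash("Dentry cache",
                  sizeof(struct hlist_bl_head), dhash_entries, 13,
                  HASH_EARLY | HASH_ZERO, &d_hash_shift, NULL, 0, 0);
          d_hash_shift = 32 - d_hash_shift;
  }

  static void __init dcache_init(void)
  {
          /* non-NUMA case: the early init above already did the work */
          if (!hashdist)
                  return;

          dentry_hashtable = alloc_large_system_hash("Dentry cache",
                  sizeof(struct hlist_bl_head), dhash_entries, 13,
                  HASH_ZERO, &d_hash_shift, NULL, 0, 0);
          d_hash_shift = 32 - d_hash_shift;
  }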

Annoying, yes. I'm not sure that the dual init makes any actual sense
- I think it's entirely a historical oddity.

But that "done conditionally in two different places" may be ugly, but
even if we fixed it, we'd fix it by doing it in just once, and it
would be that later "NUMA has been initialized" vfs_caches_init()
case.

Which is too late for the x86 alternatives.
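
In init/main.c terms, the ordering is roughly (simplified, from memory):

  start_kernel() ->
    ...
    vfs_caches_init_early()       <- non-hashdist dcache hash set up here
    ...
    arch_cpu_finalize_init() ->
      alternative_instructions()  <- x86 alternatives finalized here
    ...
    vfs_caches_init()             <- hashdist (NUMA) dcache hash set up here
    ...
    rest_init()                   <- kernel_init(), initcalls etc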

The arm64 late case would seem to work fine. It's late enough to be
after all the "core kernel init", but still early enough to be before
the "generic" initcalls that start initializing filesystems etc (which
then need the vfs code to have been initialized).

So that "smp_init()" placement that arm64 has is actually a very good
place for at least the dcache case. It's just not what x86 does.

Note that my "just replace the constants" model avoids all the
ordering issues because it just does the constant initialization
synchronously when the constant is initialized.

So it doesn't depend on any other ordering at all, and there is no
worry about subtle differences in when alternatives are applied, or
when the uses happen.

(It obviously does have the same ordering requirement that the
variable initialization itself has: the dcache init itself has to
happen before any dcache use, but that's neither surprising nor a new
ordering imposed by the runtime constant case).
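
To make that concrete, with the helpers from this series the dcache
side ends up looking something like the sketch below (macro names as
in the patches, details from memory):

  /* Sketch only - assumes the runtime_const_*() helpers from this series */

  static inline struct hlist_bl_head *d_hash(unsigned long hashlen)
  {
          /* both "constants" are immediates patched into the code */
          return runtime_const_ptr(dentry_hashtable) +
                  runtime_const_shift_right_32(hashlen, d_hash_shift);
  }

  static void __init dcache_init(void)
  {
          /* ... hash table allocated as before ... */
          d_hash_shift = 32 - d_hash_shift;

          /*
           * Patch the immediates right here, synchronously, as part of
           * the normal dcache init - no dependency on when the
           * alternatives machinery happens to run.
           */
          runtime_const_init(shift, d_hash_shift);
          runtime_const_init(ptr, dentry_hashtable);
  }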

There's an advantage to just being self-sufficient and not tying into
random other subsystems that have random other constraints.

              Linus
