Message-ID: <CAGudoHGWL6gLjmo3m6uCt9ueHL9rGCdw_jz9FLvgu_3=3A-BrA@mail.gmail.com>
Date: Thu, 6 Nov 2025 13:06:06 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
"the arch/x86 maintainers" <x86@...nel.org>, brauner@...nel.org, viro@...iv.linux.org.uk, jack@...e.cz,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
tglx@...utronix.de, pfalcato@...e.de
Subject: Re: [PATCH 1/3] x86: fix access_ok() and valid_user_address() using
wrong USER_PTR_MAX in modules
On Thu, Nov 6, 2025 at 12:14 PM Borislav Petkov <bp@...en8.de> wrote:
>
> On Wed, Nov 05, 2025 at 09:50:51PM +0100, Mateusz Guzik wrote:
> > For unrelated reasons I disassembled kmem_cache_free and the following
> > goodies popped up:
> > sub 0x18e033f(%rip),%rax # ffffffff82f944d0 <page_offset_base>
> > [..]
> > add 0x18e031d(%rip),%rax # ffffffff82f944c0 <vmemmap_base>
> > [..]
> > mov 0x2189e19(%rip),%rax # ffffffff8383e010 <__pi_phys_base>
> >
> > These are definitely worthwhile to get rid of.
>
> Says which semi-respectable benchmark?
>
> If none, why bother?
>
I don't know what you are trying to say here.
Are you protesting the notion that reducing cache footprint of the
memory allocator is a good idea, or perhaps are you claiming these
vars are too problematic to warrant the effort, or something else?
I'll note that contrary to popular belief the Linux kernel is very
much *slow* in terms of single-threaded performance and it is not
about mitigations or hardening measures. There are tidbits of heavy
microoptimization here and there, but that's all paired with massive
perf loss a few instructions later -- inlined rep movsq/stosq for small
sizes (gcc is at fault here), lock-prefixed instructions when they can
be avoided, but also cache-cold memory accesses which don't need to be
there and so on.
One great example of slowness is the SLUB allocator with its
cmpxchg16b-using fast paths, but that recently got damage-controlled
with the introduction of "sheaves". Even then, it still leaves performance
on the table.
I don't know if you consider this semi-respectable or better, but
years back Ingo Molnar created a simple benchmark for i-cache
footprint: https://lkml.org/lkml/2015/5/19/1009
I have been using a modified version of it on and off to optimize
FreeBSD and through systematic removal of tons of avoidable work
(including memory references which did not need to be there) I got to
single-threaded performance beating Linux. It's not that anything
clever is taking place there (in fact there is still plenty of room
for improvement), rather Linux has systemic issues where it loses on
performance when it does not have to.
All that said, will you notice not taking a cache miss in there in the
sea of other cache misses and other slowdowns which are currently
present? I don't think so, but it does not invalidate the notion that
they should be eliminated if feasible.
I feel compelled to note that runtime-consting of USER_PTR_MAX came in with
no benchmark results (semi-respectable or otherwise) and still
received no pushback despite a bug being uncovered related to it. Per
the above, I think runtime-consting of the thing makes perfect sense
and does not warrant benchmarking. Like I said, I'm not sure what you
were trying to state. If your position is that a benchmark is required
to remove a memory reference from a frequently used codepath, then you
should be protesting USER_PTR_MAX.
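To make the kind of memory reference in question concrete, here is a
minimal userspace sketch. It is not kernel code and does no runtime
patching; demo_offset_base, DEMO_OFFSET_CONST and the helper functions
are hypothetical names made up for illustration, and the constant is
only an example value. The first function reads a global, so on x86-64
a compiler will typically emit a RIP-relative load like the ones in the
disassembly quoted above. The second uses a value known at build time,
so it can be encoded as an immediate with no data memory access.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a boot-time variable such as page_offset_base. */
uint64_t demo_offset_base = 0xffff888000000000ULL;

/* Hypothetical fixed value, only to show the immediate-operand form. */
#define DEMO_OFFSET_CONST 0xffff888000000000ULL

/* Variable form: the base has to be loaded from memory on each call. */
uint64_t virt_to_off_var(uint64_t addr)
{
	return addr - demo_offset_base;
}

/* Constant form: the base can live in the instruction stream itself. */
uint64_t virt_to_off_const(uint64_t addr)
{
	return addr - DEMO_OFFSET_CONST;
}

/* Sanity helper: both forms must agree for the same input. */
void demo_check(uint64_t addr)
{
	assert(virt_to_off_var(addr) == virt_to_off_const(addr));
}
```

Comparing the output of "objdump -d" for the two functions should show
the difference. Runtime-consting, as I understand it, aims for the
second form while still letting the value be chosen at boot.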