Message-ID: <CAHk-=wiJiDSPZJTV7z3Q-u4DfLgQTNWqUqqrwSBHp0+Dh016FA@mail.gmail.com>
Date: Thu, 6 Nov 2025 16:48:33 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Vitaly Kuznetsov <vkuznets@...hat.com>, Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH] KVM: x86: Use "checked" versions of get_user() and put_user()
On Thu, 6 Nov 2025 at 13:02, Sean Christopherson <seanjc@...gle.com> wrote:
>
> Use the normal, checked versions for get_user() and put_user() instead of
> the double-underscore versions that omit range checks, as the checked
> versions are actually measurably faster on modern CPUs (12%+ on Intel,
> 25%+ on AMD).

Thanks. I'm assuming I'll see this from the regular kvm pull at some point.

We have a number of other cases of this in x86 signal handling, and
those probably should also be just replaced with plain get_user()
calls.

The x86 FPU context handling in particular is disgusting, and doesn't
have access_ok() close to the actual accesses. The access_ok() is in
copy_fpstate_to_sigframe(), while the __get_user() calls are in a
different file entirely.

That's almost certainly also a pessimization, in *addition* to being
an unreadable mess with security implications if anybody ever gets
that code wrong. So I really think that should be fixed.

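To make the hazard concrete, here is a rough userspace sketch of the two
styles. It is a toy model, not kernel code: the 256-byte "address space",
the user limit, and every model_* name are assumptions invented for the
example, and the real get_user()/__get_user() are arch-specific macros
that work quite differently.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy address space: low half models user memory, high half kernel memory. */
#define MODEL_MEM_SIZE   256u
#define MODEL_USER_LIMIT 128u   /* hypothetical stand-in for the user limit */

static uint8_t model_mem[MODEL_MEM_SIZE];

/* Models access_ok(): the whole range must sit below the user limit. */
static bool model_access_ok(size_t addr, size_t size)
{
    return size <= MODEL_USER_LIMIT && addr <= MODEL_USER_LIMIT - size;
}

/*
 * Models __get_user(): no range check at all. It is only safe if some
 * caller, possibly in another file, already did the access_ok().
 */
static int model_get_user_unchecked(uint32_t *val, size_t addr)
{
    memcpy(val, &model_mem[addr], sizeof(*val));
    return 0;
}

/* Models get_user(): the range check lives right next to the access. */
static int model_get_user(uint32_t *val, size_t addr)
{
    if (!model_access_ok(addr, sizeof(*val)))
        return -EFAULT;
    memcpy(val, &model_mem[addr], sizeof(*val));
    return 0;
}
```

In this model, handing offset 200 (a "kernel" address) to
model_get_user_unchecked() happily reads it, while model_get_user()
returns -EFAULT. Nothing at the unchecked call site tells you whether the
distant check was ever done.
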
The perf events core similarly has some odd checking. For a moment I
thought it used __get_user() as a way to do both user and kernel
frames, but no, it actually has an alias for access_ok(), except it
calls it "valid_user_frame()" and for some reason uses "__access_ok()"
which lacks the compiler "likely()" marking.
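
Roughly, the difference being described looks like the sketch below. The
sketch_* names and the limit value are made up for illustration, the
range check follows the generic form rather than the x86 one, and
__builtin_expect() is a GCC/Clang builtin, so treat this as an
approximation and not the actual kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space limit for the sketch. */
#define SKETCH_USER_LIMIT ((size_t)1 << 20)

/* Branch hint in the style of the kernel's likely(); GCC/Clang builtin. */
#define sketch_likely(x) __builtin_expect(!!(x), 1)

/* Raw range check, playing the role of __access_ok(). */
static inline bool sketch_raw_access_ok(size_t addr, size_t size)
{
    return size <= SKETCH_USER_LIMIT && addr <= SKETCH_USER_LIMIT - size;
}

/* Plays the role of access_ok(): same check, marked as the expected case. */
#define sketch_access_ok(addr, size) \
    sketch_likely(sketch_raw_access_ok(addr, size))

/* Plays the role of valid_user_frame(): an alias of the raw check, no hint. */
static inline bool sketch_valid_user_frame(size_t fp, size_t size)
{
    return sketch_raw_access_ok(fp, size);
}
```

Both forms accept and reject exactly the same ranges. The only thing the
alias drops is the hint that the check normally succeeds.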

Anyway, every single __get_user() call I looked at looked like
historical garbage.

Another example of complete horror: the PCI code uses
__get_user/__put_user in the /proc handling code.

Which didn't even make sense historically, when the actual data read
or written is then used with the pci_user_read/write_config_xyz()
functions.

I suspect it may go back to some *really* old code when the PCI writes
were also done as just raw inline asm, and while that has not been the
case for decades, the user accesses remained because they still
worked. That code predates not just git, but the BK tree too.

End result: I get the feeling that we should just do a global
search-and-replace of the __get_user/__put_user users, replace them
with plain get_user/put_user instead, and then fix up any fallout (eg
the coco code).

Because unlike the "start checking __{get,put}_user() addresses", such
a global search-and-replace could then be reverted one case at a time
as people notice "that was one of those horror-cases that actually
*wanted* to work with kernel addresses too".

Clearly it's much too late to do that for 6.18, but if somebody
reminds me during the 6.19 merge window, I think I'll do exactly that.

Or even better - could some brave heroic soul who wants to deal with
the fallout do this in a branch for linux-next?

Linus