Message-ID: <CAGudoHFwa3Wa-SYHJ7VVHoQMoB2ZkZqzq=Pth5xeovSGON_YHA@mail.gmail.com>
Date: Thu, 3 Apr 2025 01:39:09 +0200
From: Mateusz Guzik <mjguzik@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andrew Cooper <andrew.cooper3@...rix.com>, linux-kernel@...r.kernel.org,
mingo@...hat.com, x86@...nel.org,
"Peter Zijlstra (Intel)" <peterz@...radead.org>
Subject: Re: [RFC PATCH] x86: prevent gcc from emitting rep movsq/stosq for
inlined ops
On Wed, Apr 2, 2025 at 8:56 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
> Admittedly my main use case for that was the big fixed-size case in
> the stat code, which just does
>
> return copy_to_user(statbuf,&tmp,sizeof(tmp)) ? -EFAULT : 0;
>
> and shows up like a sore thumb on some benchmarks.
>
> But I've always hated the patch because it turns out that the real fix
> is to get rid of the temporary buffer on the stack, and just copy the
> fields one-by-one with 'unsafe_put_user()' and friends, but every time
> I do that - and I've done it several times - I end up throwing the
> patch away because it ends up being problematic on non-x86
> architectures
>
> (Reason: INIT_STRUCT_STAT64_PADDING(). Ugh).
Well, one could do ugliness as follows, opt-in for interested uarchs,
pseudocode-wise:
__stat_fill_padding_start(user);
unsafe_put_user(stat->somefield0, &user->somefield0, Efault);
__stat_fill_padding0(user);
unsafe_put_user(stat->somefield1, &user->somefield1, Efault);
__stat_fill_padding1(user);
.....
__stat_fill_padding_end(user);
Where archs would provide padding macros as needed, expanding to
nothing or to zeroing the relevant holes. I don't know if every arch has
the fields in the same order, so it may be that the stores would not come
out clean here. I guess the two archs which really matter are amd64 and
arm64? Archs not interested still get copy_to_user.
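
To illustrate (purely a sketch, the field name is made up), an arch
could define something like:

/* arch whose struct stat64 has a 32-bit hole after st_ino: */
#define __stat_fill_padding0(user) \
	unsafe_put_user(0, &(user)->__st_ino_pad, Efault)

/* arch with no hole there: */
#define __stat_fill_padding0(user) do { } while (0)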
Anyhow, for stat specifically, I think there is a different way out
and I suspect it will be faster.
Things start with memsetting struct kstat and populating some of it,
and only some of the populated stuff is used to begin with. There are
some getattr calls to further fill out kstat. Finally the kstat gets
converted into the userspace stat layout and copied out.
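
Roughly the current shape of the path (simplified from memory, the real
code in fs/stat.c has more going on):

SYSCALL_DEFINE2(newfstat, unsigned int, fd, struct stat __user *, statbuf)
{
	struct kstat stat;	/* zeroed and partially filled */
	int error = vfs_fstat(fd, &stat);	/* ->getattr fills in more */

	if (error)
		return error;
	/* kstat -> struct stat conversion, then copy_to_user */
	return cp_new_stat(&stat, statbuf);
}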
While you can coalesce the kstat -> stat conversion and the copyout
into one pass, there is still the cost of doing work whose results the
syscall is not even looking at.
Instead the vfs layer could fill out an on-stack stat buffer
immediately and copy that out. This should be less work (no kstat
memset for example) and not require anyone to mess with
unsafe_put_user.
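
Shape-wise something like this (all names made up, just to show the idea):

int vfs_fstat_to_user(unsigned int fd, struct stat __user *statbuf)
{
	struct stat tmp;	/* filled directly, no struct kstat middleman */
	int error = vfs_getattr_stat(fd, &tmp);	/* hypothetical helper */

	if (error)
		return error;
	return copy_to_user(statbuf, &tmp, sizeof(tmp)) ? -EFAULT : 0;
}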
I don't know if this is worth pursuing, maybe I'll hack it up out of curiosity.
--
Mateusz Guzik <mjguzik gmail.com>