Message-ID: <CAGudoHG1_r1B0pz6-HUqb6AfbAgWHxBy+TnimvQtwLLqkKtchA@mail.gmail.com>
Date:   Mon, 4 Sep 2023 01:15:10 +0200
From:   Mateusz Guzik <mjguzik@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
        linux-arch@...r.kernel.org, bp@...en8.de
Subject: Re: [PATCH v2] x86: bring back rep movsq for user access on CPUs without ERMS

On 9/4/23, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Sun, 3 Sept 2023 at 14:48, Ingo Molnar <mingo@...nel.org> wrote:
>>
>> If measurements support it then this looks like a nice optimization.
>
> Well, it seems to work, but when I profile it to see if the end result
> looks reasonable, the profile data is swamped by the return
> mispredicts from CPU errata workarounds, and to a smaller degree by
> the clac/stac overhead of SMAP.
>
> So it does seem to work - at least it boots here and everything looks
> normal - and it does seem to generate good code, but the profiles look
> kind of sad.
>
> I also note that we do a lot of stupid pointless 'statx' work that is
> then entirely thrown away for a regular stat() system call.
>
> Part of it is actual extra work to set the statx fields.
>
> But a lot of it is that even if we didn't do that, the 'statx' code
> has made 'struct kstat' much bigger, and made our code footprints much
> worse.
>
> Of course, even without the useless statx overhead, 'struct kstat'
> itself ends up having a lot of padding because of how 'struct
> timespec64' looks. It might actually be good to split it explicitly
> into seconds and nanoseconds just for padding.
>
> Because that all blows 'struct kstat' up to 160 bytes here.
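
To illustrate the point, here is a minimal userspace sketch of that
padding argument. The types are hand-copied in the spirit of
include/linux/time64.h, and the split layout is only one possible
packing, not anything from the tree:

#include <stdio.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct timespec64. */
struct timespec64 {
	int64_t tv_sec;
	long    tv_nsec;	/* holds 0..999999999; wastes 4 bytes on 64-bit */
};

/* Today: four timestamps as timespec64, 4 * 16 = 64 bytes on x86-64. */
struct stamps_timespec {
	struct timespec64 atime, mtime, ctime, btime;
};

/* Hypothetical split layout: 4 * 8 + 4 * 4 = 48 bytes, nothing wasted. */
struct stamps_split {
	int64_t  sec[4];
	uint32_t nsec[4];
};

int main(void)
{
	printf("timespec64 layout: %zu bytes\n", sizeof(struct stamps_timespec));
	printf("split layout:      %zu bytes\n", sizeof(struct stamps_split));
	return 0;
}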
>
> And to make it all worse, the statx code has caused all the
> filesystems to have their own 'getattr()' code just to fill in that
> worthless garbage, when it used to be that you could rely on
> 'generic_fillattr()'.
>
> I'm looking at ext4_getattr(), for example, and I think *all* of it is
> due to statx - which, to a close approximation, nobody cares about: it
> is a specialty system call for a couple of users.
>
> And again - the indirect branches have gone from being "a cycle or
> two" to being pipeline stalls and mispredicts. So not using just a
> plain 'generic_fillattr()' is *expensive*.
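
The dispatch makes that easy to see. Roughly paraphrased from
fs/stat.c (the signatures have shuffled a bit across recent releases,
so this is approximate, not verbatim):

int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
		      u32 request_mask, unsigned int query_flags)
{
	struct inode *inode = d_backing_inode(path->dentry);

	memset(stat, 0, sizeof(*stat));
	stat->result_mask |= STATX_BASIC_STATS;

	/* The indirect call: taken by every filesystem that provides
	 * its own ->getattr(), e.g. ext4_getattr(). */
	if (inode->i_op->getattr)
		return inode->i_op->getattr(mnt_idmap(path->mnt), path, stat,
					    request_mask, query_flags);

	/* The cheap direct path. */
	generic_fillattr(mnt_idmap(path->mnt), inode, stat);
	return 0;
}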
>
> Sad. Because the *normal* stat() family of system calls are some of
> the most important ones out there. Very much unlike statx().
>

First, a note: the patch has the side effect of dodging hardened
usercopy, but I'm testing on a kernel built without it.

bad news: virtually no difference for stat(2) (aka newfstatat):
before: 3898139
after:  3922714 (+0.6%)

ok news: a modest win for the actual fstat system call (aka newfstat,
*not* the libc wrapper calling newfstatat):
before: 8511283
after:  8746829 (+2.8%)

That is rather disappointing, but I would still take it in, in
anticipation of glibc fixing fstat, on certain conditions.

The patch as proposed is kind of crap, though -- moving all that
handling out of fs/stat.c is quite weird and trivially avoidable; a way
to do it is already present in the file.

Following the existing example of INIT_STRUCT_STAT_PADDING, you could
add these to an x86-64 header:

#define INIT_STRUCT_STAT_PAD0 \
	unsafe_put_user(0, &ubuf->__pad0, Efault);
#define INIT_STRUCT_STAT_UNUSED \
	{ unsafe_put_user(0, ....); unsafe_put_user(0, ....); }

The new function placed in fs/stat.c would get the same codegen as it
does with your patch, but other archs could easily join in, and no stat
logic would escape the file (or get duplicated again if another arch
wants to dodge the temp buffer). A fuller sketch of what I mean is
below.
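
To be concrete, something along these lines. cp_stat_unsafe() is a
made-up name, I have given the macros an explicit ubuf argument, the
field list is abridged, and the overflow checks from cp_new_stat() are
elided:

/* x86-64 header: spell out the padding via unsafe_put_user(); the
 * Efault label is expected in the caller, as with filldir64(). */
#define INIT_STRUCT_STAT_PAD0(ubuf)				\
	unsafe_put_user(0, &(ubuf)->__pad0, Efault)
#define INIT_STRUCT_STAT_UNUSED(ubuf)				\
do {								\
	unsafe_put_user(0, &(ubuf)->__unused[0], Efault);	\
	unsafe_put_user(0, &(ubuf)->__unused[1], Efault);	\
	unsafe_put_user(0, &(ubuf)->__unused[2], Efault);	\
} while (0)

/* fs/stat.c: fill struct stat in place, no temporary buffer. */
static int cp_stat_unsafe(struct kstat *stat, struct stat __user *ubuf)
{
	if (!user_write_access_begin(ubuf, sizeof(*ubuf)))
		return -EFAULT;
	unsafe_put_user(new_encode_dev(stat->dev), &ubuf->st_dev, Efault);
	unsafe_put_user(stat->ino, &ubuf->st_ino, Efault);
	unsafe_put_user(stat->nlink, &ubuf->st_nlink, Efault);
	unsafe_put_user(stat->mode, &ubuf->st_mode, Efault);
	unsafe_put_user(from_kuid_munged(current_user_ns(), stat->uid),
			&ubuf->st_uid, Efault);
	unsafe_put_user(from_kgid_munged(current_user_ns(), stat->gid),
			&ubuf->st_gid, Efault);
	INIT_STRUCT_STAT_PAD0(ubuf);
	/* ... remaining fields as in cp_new_stat() ... */
	INIT_STRUCT_STAT_UNUSED(ubuf);
	user_write_access_end();
	return 0;
Efault:
	user_write_access_end();
	return -EFAULT;
}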

Then I think it is a perfectly tolerable thing to do, but again, given
the modest win, it is perhaps too early.

Someone(tm) should probably sort out stat performance as is. ;)

-- 
Mateusz Guzik <mjguzik gmail.com>
