Date:   Mon, 15 Nov 2021 20:39:22 +0000
From:   Nick Terrell <terrelln@...com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
CC:     Guenter Roeck <linux@...ck-us.net>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Nick Terrell <nickrterrell@...il.com>,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>
Subject: Re: Linux 5.16-rc1



> On Nov 15, 2021, at 9:53 AM, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> 
> On Mon, Nov 15, 2021 at 9:07 AM Guenter Roeck <linux@...ck-us.net> wrote:
>> 
>> Top of tree is a bit better:
> 
> Thanks for re-testing.
> 
> That doesn't actually look all that bad for -rc1.  Several of them
> already have fixes, and most of the rest look "easily fixable".
> 
> Famous last words.
> 
> The most worrisome ones are probably the stack frame complaint ones
> (libzstd and a couple of powerpc ones) that Geert also reported, but
> they might at least to some degree be as simple as just due to the
> same excessive inlining that was already fingered for the code bloat.
> 
> But it could be more fundamental - the kernel just doesn't like stack
> allocations the same way user space does, so the sync-up to a newer
> libzstd might be a bit more problematic than just "don't force
> inlining".

On x86-64 I’ve measured zstd’s stack usage for compression at 1.6KB, up
from 1.4KB before the change. I suspect the problem is specific to these
functions on this compiler + architecture combo, where the compiler isn’t
able to inline, propagate constants, and then eliminate dead code. The
functions mentioned rely on these optimizations to be efficient, and I
suspect that when the optimizations fail there is a lot of unnecessary
stack usage.

The solution should be to remove the dependency on compiler optimizations
for efficient stack usage in these functions, so that we don’t end up with
excess stack usage on non-x86/arm architectures.
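
To illustrate the general pattern (a minimal sketch, not the actual zstd
code): every name below (checksum_generic, checksum_split, SCRATCH_BYTES,
enum mode) is hypothetical. The first function only gets a small frame if
the compiler inlines it with a constant mode and drops the unused branch.
The second form keeps the large buffer in its own non-inlined function, so
the frame size no longer depends on the optimizer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCRATCH_BYTES 1024 /* hypothetical size, for illustration only */

enum mode { MODE_FAST = 0, MODE_THOROUGH = 1 }; /* hypothetical */

/*
 * Pattern A: one generic body. The stack cost is only small if the
 * compiler inlines this with a constant `mode` and removes the dead
 * branch. Otherwise the frame has to hold scratch[] in both modes.
 */
static inline size_t checksum_generic(const unsigned char *src, size_t len,
                                      enum mode mode)
{
    size_t sum = 0;

    if (mode == MODE_THOROUGH) {
        unsigned char scratch[SCRATCH_BYTES];
        size_t n = len < SCRATCH_BYTES ? len : SCRATCH_BYTES;

        memcpy(scratch, src, n);
        for (size_t i = 0; i < n; i++)
            sum += scratch[i] * 2u;
    } else {
        for (size_t i = 0; i < len; i++)
            sum += src[i];
    }
    return sum;
}

/*
 * Pattern B: the buffer lives in a separate function that is never
 * inlined (GCC/Clang attribute), so only this frame pays for it.
 */
__attribute__((noinline))
static size_t checksum_thorough(const unsigned char *src, size_t len)
{
    unsigned char scratch[SCRATCH_BYTES];
    size_t n = len < SCRATCH_BYTES ? len : SCRATCH_BYTES;
    size_t sum = 0;

    assert(n <= sizeof(scratch));
    memcpy(scratch, src, n);
    for (size_t i = 0; i < n; i++)
        sum += scratch[i] * 2u;
    return sum;
}

static size_t checksum_fast(const unsigned char *src, size_t len)
{
    size_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum += src[i];
    return sum;
}

/* Explicit dispatch: the fast path never carries the scratch buffer. */
static size_t checksum_split(const unsigned char *src, size_t len,
                             enum mode mode)
{
    return mode == MODE_THOROUGH ? checksum_thorough(src, len)
                                 : checksum_fast(src, len);
}
```

Both forms compute the same result. Per-function frame sizes can be
compared with GCC's -fstack-usage or -Wframe-larger-than= options; the
actual numbers depend on compiler, flags, and architecture.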

On my todo list is:

1. Reduce stack usage of the mentioned functions
2. Reduce code size bloat of lib/zstd/zstd_opt.c

I’m working on this now, and expect to have a pull request ready to go
tomorrow.

> Nick - you've been cc'd twice because you sign off your commits with
> your work email, but then seem to actually prefer the personal one, so
> I didn't know which to use and just added both. See

Sorry for the confusion. Both work, but I prefer my work email.

>  https://lore.kernel.org/lkml/652edea7-28a0-70d9-c63f-d910b5942454@roeck-us.net/
>  https://lore.kernel.org/lkml/20211115155105.3797527-1-geert@linux-m68k.org
> 
> if you didn't already.
> 
>               Linus

Best,
Nick Terrell

