Date:   Fri, 07 Oct 2022 10:28:02 +0200
From:   "Arnd Bergmann" <arnd@...db.de>
To:     "Nick Desaulniers" <ndesaulniers@...gle.com>
Cc:     linux-fsdevel@...r.kernel.org,
        "Alexander Viro" <viro@...iv.linux.org.uk>,
        "Andrew Morton" <akpm@...ux-foundation.org>,
        "Andi Kleen" <ak@...ux.intel.com>,
        "Christoph Hellwig" <hch@....de>,
        "Eric Dumazet" <edumazet@...gle.com>,
        "Darrick J. Wong" <darrick.wong@...cle.com>,
        "Greg Kroah-Hartman" <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org, llvm@...ts.linux.dev
Subject: Re: [PATCH] fs/select: avoid clang stack usage warning

On Fri, Oct 7, 2022, at 12:21 AM, Nick Desaulniers wrote:
> On Thu, Mar 07, 2019 at 10:01:36AM +0100, Arnd Bergmann wrote:
>> The select() implementation is carefully tuned to put a sensible amount
>> of data on the stack for holding a copy of the user space fd_set,
>> but not too large to risk overflowing the kernel stack.
>> 
>> When building a 32-bit kernel with clang, we need a little more space
>> than with gcc, which often triggers a warning:
>> 
>> fs/select.c:619:5: error: stack frame size of 1048 bytes in function 'core_sys_select' [-Werror,-Wframe-larger-than=]
>> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
>> 
>> I experimentally found that for 32-bit ARM, reducing the maximum
>> stack usage by 64 bytes keeps us reliably under the warning limit
>> again.
>> 
>> Signed-off-by: Arnd Bergmann <arnd@...db.de>
>> ---
>>  include/linux/poll.h | 4 ++++
>>  1 file changed, 4 insertions(+)
>> 
>> diff --git a/include/linux/poll.h b/include/linux/poll.h
>> index 7e0fdcf905d2..1cdc32b1f1b0 100644
>> --- a/include/linux/poll.h
>> +++ b/include/linux/poll.h
>> @@ -16,7 +16,11 @@
>>  extern struct ctl_table epoll_table[]; /* for sysctl */
>>  /* ~832 bytes of stack space used max in sys_select/sys_poll before allocating
>>     additional memory. */
>> +#ifdef __clang__
>> +#define MAX_STACK_ALLOC 768
>
> Hi Arnd,
> Upon a toolchain upgrade for Android, our 32b x86 image used for
> first-party developer VMs started tripping -Wframe-larger-than= again
> (thanks -Werror) which is blocking our ability to upgrade our toolchain.
>
> I've attached the zstd compressed .config file that reproduces with ToT
> LLVM:
>
> $ cd linux
> $ zstd -d path/to/config.zst -o .config
> $ make ARCH=i386 LLVM=1 -j128 fs/select.o
> fs/select.c:625:5: error: stack frame size (1028) exceeds limit (1024)
> in 'core_sys_select' [-Werror,-Wframe-larger-than]
> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
>     ^
>
> As you can see, we're just barely tipping over the limit.  Should I send
> a patch to reduce this again? If so, any thoughts by how much?
> Decrementing the current value by 4 builds the config in question, but
> seems brittle.
>
> Do we need to only do this if !CONFIG_64BIT?
> commit ad312f95d41c ("fs/select: avoid clang stack usage warning")
> seems to allude to this being more problematic on 32b targets?

I think we should keep the limit consistent between 32-bit and 64-bit
kernels. Lowering the allocation a bit more would of course have a
performance impact for users that are just below the current limit,
so I think it would be best to first look at what might be going
wrong in the compiler.
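
To illustrate the trade-off, here is a minimal userspace sketch of the
"stack buffer first, heap fallback" pattern. It is not the kernel code:
the SKETCH_STACK_ALLOC value and all sketch_* names are made up for
illustration, and the real constants and layout are in
include/linux/poll.h and fs/select.c. The point is only that a request
just above the stack threshold has to take the slower allocation path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative value only (assumption), not taken from the kernel headers. */
#define SKETCH_STACK_ALLOC 256

/* Bytes needed for one bitmap of nfds bits, rounded up to whole longs. */
static size_t sketch_fds_bytes(size_t nfds)
{
	size_t bits_per_long = 8 * sizeof(long);

	return (nfds + bits_per_long - 1) / bits_per_long * sizeof(long);
}

/* Returns 1 if six bitmaps (in/out/ex plus three results) fit on the stack. */
static int sketch_fits_on_stack(size_t nfds)
{
	return sketch_fds_bytes(nfds) <= SKETCH_STACK_ALLOC / 6;
}

/*
 * Zeroes six bitmaps for nfds descriptors. Uses the on-stack buffer when
 * they fit and falls back to malloc() otherwise. Returns 0 on success,
 * -1 on allocation failure; *used_heap reports which path was taken.
 */
static int sketch_prepare_bitmaps(size_t nfds, int *used_heap)
{
	long stack_buf[SKETCH_STACK_ALLOC / sizeof(long)];
	size_t size = sketch_fds_bytes(nfds);
	void *bits = stack_buf;

	assert(used_heap != NULL);
	*used_heap = 0;

	if (size > sizeof(stack_buf) / 6) {
		/* Not enough room on the stack: slower path. */
		bits = malloc(6 * size);
		if (!bits)
			return -1;
		*used_heap = 1;
	}

	memset(bits, 0, 6 * size);

	if (bits != stack_buf)
		free(bits);
	return 0;
}
```

With these illustrative numbers, sketch_fits_on_stack(320) is true and
sketch_fits_on_stack(321) is false, so shrinking the stack buffer moves
that boundary down and pushes more callers onto the heap path.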

I managed to reproduce the issue and had a look at what happens
here. A few random observations:

- The kernel is built with -fsanitize=local-bounds. Dropping this
  option reduces the stack allocation for this function by around
  100 bytes, which would be the easiest way for you to build
  those kernels again without any source changes. It may also
  be possible to change the core_sys_select function in a way that
  avoids the insertion of runtime bounds checks.

- If I mark 'do_select' as noinline_for_stack, the reported frame
  size is decreased a lot and is suddenly independent of
  -fsanitize=local-bounds:

  fs/select.c:625:5: error: stack frame size (336) exceeds limit (100) in 'core_sys_select' [-Werror,-Wframe-larger-than]
  int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
  fs/select.c:479:21: error: stack frame size (684) exceeds limit (100) in 'do_select' [-Werror,-Wframe-larger-than]
  static noinline int do_select(int n, fd_set_bits *fds, struct timespec64 *end_time)

  However, I don't even see how this makes sense at all, given that
  the actual frame size should be at least SELECT_STACK_ALLOC!

- The behavior of -ftrivial-auto-var-init= is a bit odd here: with =zero or
  =pattern, the stack usage is just below the limit (1020), while without
  the option it is up to 1044. It looks like your .config picks =zero, which
  was dropped in the latest clang version, so it falls back to not
  initializing. Setting it to =pattern should give you the old
  behavior, but I don't understand why clang uses more stack without
  the initialization rather than less, given that skipping the
  initialization should, if anything, cause fewer spills.
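
As a rough illustration of the noinline observation above, here is a
minimal userspace sketch. It assumes a GCC-compatible compiler; the
sketch_* names and buffer sizes are hypothetical, and
sketch_noinline_for_stack only stands in for the kernel's
noinline_for_stack annotation. When the helper is kept out of line,
its large buffer is accounted to its own frame rather than being merged
into the caller's, which is why a frame-size warning can then report
the two functions separately.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: userspace stand-in for the kernel's noinline_for_stack. */
#define sketch_noinline_for_stack __attribute__((noinline))

/*
 * Helper with a large local buffer. Because it is never inlined, the
 * 512 bytes live in this function's frame, not in the caller's.
 */
static sketch_noinline_for_stack int sketch_helper(int seed)
{
	volatile unsigned char scratch[512];
	size_t i;

	assert(seed >= 0);
	for (i = 0; i < sizeof(scratch); i++)
		scratch[i] = (unsigned char)(seed + i);

	return scratch[0] + scratch[sizeof(scratch) - 1];
}

/* Caller with only a small local array of its own. */
static int sketch_caller(int seed)
{
	volatile long small[8];
	long sum = 0;
	size_t i;

	for (i = 0; i < 8; i++)
		small[i] = (long)i;
	for (i = 0; i < 8; i++)
		sum += small[i];

	return (int)sum + sketch_helper(seed);
}
```

Building a file like this with -Wframe-larger-than= set to a small
value, with and without the noinline attribute, is one way to compare
how the compiler attributes the stack usage to each function.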

       Arnd
