Message-ID: <20240829205436.GA14562@brightrain.aerifal.cx>
Date: Thu, 29 Aug 2024 16:54:38 -0400
From: Rich Felker <dalias@...c.org>
To: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Cc: linux-api@...r.kernel.org, libc-alpha@...rceware.org,
	musl@...ts.openwall.com
Subject: AT_MINSIGSTKSZ mismatched interpretation kernel vs libc

As I understand it, the AT_MINSIGSTKSZ auxv value is supposed to be a
suitable runtime value for MINSIGSTKSZ (sysconf(_SC_MINSIGSTKSZ)),
such that it's safe to pass as a size to sigaltstack. However, this is
not how the kernel actually implements it. At least on x86 and
powerpc, the kernel fills it via get_sigframe_size, which computes the
size of the sigcontext/siginfo/etc to be pushed and uses that
directly, without allowing any space for actual execution, and without
ensuring the value is at least as large as the legacy constant
MINSIGSTKSZ. This leads to two problems:

1. If userspace passes the value to sigaltstack without first raising
   it to at least MINSIGSTKSZ, sigaltstack will fail with ENOMEM
   whenever the value is smaller than MINSIGSTKSZ.

2. If the kernel needs more space than MINSIGSTKSZ just for the signal
   frame structures, userspace that trusts AT_MINSIGSTKSZ will only
   allocate enough for the frame, and the program will immediately
   crash/stack-overflow once execution passes to userspace.

Since existing kernels in the wild can't be fixed, and since it looks
like the problem is just that the kernel chose a poor definition of
AT_MINSIGSTKSZ, I think userspace (glibc, musl, etc.) needs to work
around the problem by adding a per-arch correction term to
AT_MINSIGSTKSZ that's basically equal to:

    legacy_MINSIGSTKSZ - AT_MINSIGSTKSZ as returned on legacy hw

such that adding the correction term would reproduce the expected
value MINSIGSTKSZ.
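
As a rough sketch of that arithmetic (the function names are
hypothetical, and the numbers in the usage note are placeholders, not
measured values for any architecture):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the per-arch correction term, i.e. the legacy
 * MINSIGSTKSZ constant minus the AT_MINSIGSTKSZ value the kernel
 * reports on legacy hardware. */
static size_t correction_term(size_t legacy_minsigstksz,
                              size_t auxv_on_legacy_hw)
{
    /* The workaround assumes the legacy frame fits in the legacy
     * constant; otherwise the term would be negative. */
    assert(legacy_minsigstksz >= auxv_on_legacy_hw);
    return legacy_minsigstksz - auxv_on_legacy_hw;
}

/* Hypothetical helper: the runtime minimum a libc could report,
 * given the raw auxv value and the fixed per-arch correction. */
static size_t corrected_minsigstksz(size_t auxv_value, size_t correction)
{
    return auxv_value + correction;
}
```

With placeholder inputs of 2048 for the legacy constant and 1440 for
the legacy-hardware auxv value, the correction is 608. A raw auxv
value of 1440 then maps back to 2048, and a larger raw value of 3000
maps to 3608, keeping the same working space on top of the frame.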

The only question is whether the kernel will commit to keeping this
behavior, or whether it would be "fixed" to include all the needed
working space once the kernel developers eventually decide they want
bigger stacks for some new register file bloat. I think keeping the
current behavior, so we can just add a fixed offset, is probably the
best thing to do.

Rich
