linux-kernel - Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <zbifzd2fa3epywbhfznfr24mmacsndkwlengq3cqcgply5it5v@33zajtzvjetj>
Date: Sun, 25 May 2025 13:36:16 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: David Laight <david.laight.linux@...il.com>
Cc: Arnd Bergmann <arnd@...db.de>, 
	Naresh Kamboju <naresh.kamboju@...aro.org>, linux-bcache@...r.kernel.org, 
	open list <linux-kernel@...r.kernel.org>, lkft-triage@...ts.linaro.org, 
	Linux Regressions <regressions@...ts.linux.dev>, Dan Carpenter <dan.carpenter@...aro.org>, 
	Anders Roxell <anders.roxell@...aro.org>, Steven Rostedt <rostedt@...dmis.org>
Subject: Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is
 larger than 2048 bytes [-Werror=frame-larger-than=]

+cc Steve

On Sun, May 25, 2025 at 06:18:42PM +0100, David Laight wrote:
> On Fri, 23 May 2025 20:01:33 +0200
> "Arnd Bergmann" <arnd@...db.de> wrote:
> 
> > On Fri, May 23, 2025, at 19:11, Kent Overstreet wrote:
> > > On Fri, May 23, 2025 at 05:17:15PM +0200, Arnd Bergmann wrote:  
> > >> 
> > >> - KASAN_STACK adds extra redzones for each variable
> > >> - KASAN_STACK further prevents stack slots from getting
> > >>   reused inside one function, in order to better pinpoint
> > >>   which instance caused problems like out-of-scope access
> > >> - passing structures by value causes them to be put on
> > >>   the stack on some architectures, even when the structure
> > >>   size is only one or two registers  
> > >
> > > We mainly do this with bkey_s_c, which is just two words: on x86_64,
> > > that gets passed in registers. Is riscv different?  
> > 
> > Not sure, I think it's mostly older ABIs that are limited,
> > either not passing structures in registers at all, or only
> > possibly one but not two of them.
> > 
> > >> - sanitizers turn off optimizations that lead to better
> > >>   stack usage
> > >> - in some cases, the missed optimization ends up causing
> > >>   local variables to get spilled to the stack many times
> > >>   because of a combination of all the above.  
> > >
> > > Yeesh.
> > >
> > > I suspect we should be running with a larger stack when the sanitizers
> > > are running, and perhaps tweak the warnings accordingly. I did a bunch
> > > of stack usage work after I found a kmsan build was blowing out the
> > > stack, but then running with max stack usage tracing enabled showed it
> > > to be a largely non issue on non-sanitizer builds, IIRC.  
> > 
> > Enabling KASAN does double the available stack space. However, I don't
> > think we should use that as an excuse to raise the per-function
> > warning limit, because
> > 
> >  - the majority of all function stacks do not grow that much when
> >    sanitizers are enabled
> >  - allmodconfig enables KASAN and should still catch mistakes
> >    where a driver accidentally puts a large structure on the stack
> 
> That is rather annoying when you want to look at the generated code :-(
> 
> >  - 2KB on 64-bit targes is a really large limit. At some point
> >    in the past I had a series that lowered the limit to 1536 byte
> >    for 64-bit targets, but I never managed to get all the changes
> >    merged.
> 
> I've a cunning plan to do a proper static analysis of stack usage.
> It is a 'simple' matter of getting objtool to output all calls with
> the stack offset.
> Indirect calls need the function hashes from fine-ibt, but also need
> clang to support 'hash seeds' to disambiguate all the void (*)(void *)
> functions.
> That'll first barf at all recursion, and then, I expect, show a massive
> stack use inside snprintf() in some error path.

I suspect recursion will make the results you get with that approach
useless.

We already have "trace max stack", but that only checks at process exit,
so it doesn't tell you much.

We could do better with tracing - just inject a trampoline that checks
the current stack usage against the maximum stack usage we've seen, and
emits a trace event with a stack trace if it's greater.

(and now Steve's going to tell us he's already done this :)