Message-ID: <YmfLe+UZ85LhshZx@FVFF77S0Q05N>
Date:   Tue, 26 Apr 2022 11:37:47 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Kees Cook <keescook@...omium.org>
Cc:     linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        akpm@...ux-foundation.org, alex.popov@...ux.com,
        catalin.marinas@....com, luto@...nel.org, will@...nel.org
Subject: Re: [PATCH 0/8] stackleak: fixes and rework

On Tue, Apr 26, 2022 at 11:10:52AM +0100, Mark Rutland wrote:
> On Mon, Apr 25, 2022 at 03:54:00PM -0700, Kees Cook wrote:
> > On Mon, Apr 25, 2022 at 12:55:55PM +0100, Mark Rutland wrote:
> > > This series reworks the stackleak code. The first patch fixes some
> > > latent issues on arm64, and the subsequent patches restructure the
> > > code for clarity and better code generation.
> > 
> > This looks nice; thanks! I'll put this through build testing and get it
> > applied shortly...
> 
> Thanks!
> 
> Patch 1 is liable to conflict with some other stacktrace bits that may go in
> for v5.19, so it'd be good if that could be queued as a fix for v5.18-rc4;
> otherwise we'll have to figure out how to deal with conflicts later.
> 
> > > While the improvement is small, I think the gains in clarity and
> > > code generation are a win regardless.
> > 
> > Agreed. I also want to manually inspect the resulting memory just to
> > make sure things didn't accidentally regress. There's also an LKDTM test
> > for basic functionality.
> 
> I assume that's the STACKLEAK_ERASING test?
> 
> I gave that a spin, but on arm64 the test is flaky even on baseline v5.18-rc1,
> while on x86_64 it passes consistently across hundreds of runs. I'll go dig
> into that now.

I hacked in some debug, and it looks like the sp used in the test is far above
the current lowest_stack. The test is slightly wrong since it grabs the address
of a local variable rather than using current_stack_pointer (sketched below),
but the offset I see is much larger:

# echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT 
[   27.665221] lkdtm: Performing direct entry STACKLEAK_ERASING
[   27.665986] lkdtm: FAIL: lowest_stack 0xffff8000083a39e0 is lower than test sp 0xffff8000083a3c80
[   27.667530] lkdtm: FAIL: the thread stack is NOT properly erased!

That's off by 0x2a0 (AKA 672) bytes, and it seems to be consistent from run to
run.
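To illustrate the distinction, here's a minimal sketch (not the LKDTM code;
everything other than current_stack_pointer is invented for illustration) of
the two ways of taking "the current stack pointer":

/* Assumes kernel context; current_stack_pointer is the real helper (a
 * register variable aliasing SP on arm64 and x86). */
static void sp_sketch(void)
{
        unsigned long i;
        unsigned long sp_from_local, sp_from_reg;

        /* What the test effectively does: the address of a local lands
         * somewhere in this function's frame, which can sit well above
         * the SP the CPU is actually using. */
        sp_from_local = (unsigned long)&i;

        /* What it arguably should use: read SP directly. */
        sp_from_reg = current_stack_pointer;

        pr_info("local: 0x%lx, sp: 0x%lx\n", sp_from_local, sp_from_reg);
}

That discrepancy is bounded by the size of the test's own frame, though, so it
can't account for 0x2a0 bytes on its own.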

I note that an interrupt occurring could cause a similar offset (since on arm64
those are taken/triaged on the task stack before moving to the irq stack, and
the irq regs alone take 300+ bytes), but that doesn't seem to be the problem
here given the offset is consistent, and it appears some prior function simply
consumed a lot of stack.
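
For reference, lowest_stack is maintained by the call the stackleak GCC plugin
inserts into functions with large frames; roughly (simplified from
kernel/stackleak.c as of v5.17, so treat this as a sketch rather than the
exact code):

void noinstr stackleak_track_stack(void)
{
        unsigned long sp = current_stack_pointer;

        /* Keep lowest_stack aligned to the register width. */
        sp = ALIGN(sp, sizeof(unsigned long));

        /* Record the deepest SP seen, staying within the task stack
         * (the bottommost long is reserved and never poisoned). */
        if (sp < current->lowest_stack &&
            sp >= (unsigned long)task_stack_page(current) +
                                sizeof(unsigned long))
                current->lowest_stack = sp;
}

So lowest_stack sitting 0x2a0 below the test's sp suggests an instrumented
function earlier in the same syscall ran that much deeper, overwriting the
poison the test expects to find below its frame.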

I *think* the same irq problem would apply to x86, but maybe that initial
triage happens on a trampoline stack.

I'll dig a bit more into the arm64 side...

Thanks,
Mark.
