linux-kernel - Re: arm64: stacktrace: unwind exception boundaries

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <aieiy7cowkk7ygkhpquggbkpzn5w4prlr5ujjuawpsw4re6467@iwuu4r32lr7r>
Date: Tue, 10 Dec 2024 14:31:17 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Mark Rutland <mark.rutland@....com>
Cc: linux-arm-kernel@...ts.infradead.org, linux-bcachefs@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: arm64: stacktrace: unwind exception boundaries

On Tue, Dec 10, 2024 at 07:17:36PM +0000, Mark Rutland wrote:
> On Tue, Dec 10, 2024 at 07:40:04AM -0500, Kent Overstreet wrote:
> > On Tue, Dec 10, 2024 at 11:27:19AM +0000, Mark Rutland wrote:
> > > On Mon, Dec 09, 2024 at 11:37:12AM +0000, Mark Rutland wrote:
> > > > On Thu, Dec 05, 2024 at 01:04:59PM -0500, Kent Overstreet wrote:
> 
> > > Looking some more, I see that bch2_btree_transactions_read() is trying
> > > to unwind other tasks, and I believe what's happening here is that the
> > > unwindee isn't actually blocked for the duration of the unwind, leading
> > > to the unwinder encountering junk and consequently producing the
> > > warning.
> > > 
> > > As a test case, it's possible to trigger similar with a few parallel
> > > instances of:
> > > 
> > > 	while true; do cat /proc/*/stack > /dev/null
> > > 
> > > The only thing we can do on the arm64 side is remove the WARN_ON_ONCE(),
> > > which'll get rid of the splat. It seems we've never been unlucky enough
> > > to hit a stale fgraph entry, or that would've blown up also.
> > > 
> > > Regardless of the way arm64 behaves here, the unwind performed by
> > > bch2_btree_transactions_read() is going to contain garbage unless the
> > > task is pinned in a blocked state. AFAICT the way
> > > btree_trans::locking_wait::task is used is here is racy, and there's no
> > > guarantee that the unwindee is actually blocked.
> > 
> > Occasionally returning garbage is completely fine, as long as the
> > interface is otherwise safe. This is debug info; it's important that it
> > be available and we can't impose additional synchronization for it.
> 
> Sure thing; just note that there's no guarantee that this is only
> "occasionally" garbage -- this could be wrong 1% of the time or 99% of
> the time depending on the specific scenario, HW it's running on, etc. As
> long as you're happy to hold the pieces when that happens, that's fine.
> 
> I've pushed out fixes to the arm64/stacktrace/fixes branch on my
> kernel.org git repo:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/stacktrace/fixes
>   git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git arm64/stacktrace/fixes
> 
> ... and I'll get that out as a series on the list tomorrow.

Wonderful

The fact that /proc/pid/stack doesn't give anything if the process is in
TASK_RUNNING has been a problem in the past, yet this is something we're
not consistent on in the kernel; sysrq-trigger backtraces do give
backtraces for running processes (which then may require sorting through
everything).

So thanks for this :)