Message-ID: <20230425101324.GD1331236@hirez.programming.kicks-ass.net>
Date:   Tue, 25 Apr 2023 12:13:24 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Boqun Feng <boqun.feng@...il.com>,
        Segher Boessenkool <segher@...nel.crashing.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Zhouyi Zhou <zhouzhouyi@...il.com>,
        linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
        rcu <rcu@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>, lance@...osl.org,
        "Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail

On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> This is amazing debugging Boqun, like a boss! One comment below:
> 
> > > > Or something simple I haven't thought of? :)
> > >
> > > At what points can r13 change?  Only when some particular functions are
> > > called?
> > >
> >
> > r13 is the local paca:
> >
> >         register struct paca_struct *local_paca asm("r13");
> >
> > , which is a pointer to percpu data.
> >
> > So if a task is scheduled from one CPU to another CPU, the value
> > changes.
> 
> It appears the whole issue, per your analysis, is that the stack
> checking code in gcc should not cache or alias r13, and must read its
> most up-to-date value during stack checking, as its value may have
> changed during a migration to a new CPU.
> 
> Did I get that right?
> 
> IMO, even without a reproducer, gcc on PPC should just not do that;
> that feels terribly broken for the kernel. I wonder what clang does,
> I'll go poke around with Compiler Explorer after lunch.
> 
> Adding +Peter Zijlstra as well to join the party as I have a feeling
> he'll be interested. ;-)

I'm a little confused; the way I understand the whole stack protector
thing to work is that we push a canary on the stack at call and, on
return, check that it is still valid. Since tasks can in general migrate
at any point, the per-cpu validation canary should be the same on all CPUs.
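
The reasoning in the paragraph above can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not kernel
or compiler code: `cpu_area`, `init_canaries()` and `protected_call()` are
hypothetical names, and the model simply assumes that every CPU's area
holds the same canary value.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical stand-in for per-CPU data (the paca on PowerPC). */
struct cpu_area {
	uintptr_t canary;
};

static struct cpu_area cpu_areas[NR_CPUS];

/* Assumption of this model: every CPU's area holds the same canary. */
static void init_canaries(uintptr_t value)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_areas[cpu].canary = value;
}

/*
 * Model of a protected function: the prologue reads the canary through
 * the CPU the task starts on, the epilogue compares against the CPU the
 * task ends on. Returns 1 if the check passes, 0 where a real build
 * would end up in __stack_chk_fail().
 */
static int protected_call(int cpu_at_entry, int cpu_at_exit)
{
	assert(cpu_at_entry >= 0 && cpu_at_entry < NR_CPUS);
	assert(cpu_at_exit >= 0 && cpu_at_exit < NR_CPUS);

	uintptr_t on_stack = cpu_areas[cpu_at_entry].canary;	/* prologue */
	/* ... function body; the task may migrate here ... */
	return on_stack == cpu_areas[cpu_at_exit].canary;	/* epilogue */
}
```

In this model, `protected_call(0, 3)` passes after `init_canaries()` has
run, because a migration in the middle only changes which identical copy
is read. The check fails only if the copies differ between CPUs.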

Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
but no guarantees.
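
A rough sketch of why picking 'a' percpu structure can be good enough,
assuming the usual SRCU scheme of per-CPU lock and unlock counters that
are only ever compared as sums over all CPUs. The names below
(`srcu_data_model`, `model_read_lock()` and friends) are hypothetical,
the model is single-threaded, and real code would need atomic increments
for this to hold up under concurrency.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical model of one per-CPU SRCU bookkeeping structure. */
struct srcu_data_model {
	long lock_count;
	long unlock_count;
};

static struct srcu_data_model sdp_model[NR_CPUS];

/* Count a reader entering, on whichever CPU's structure was picked. */
static void model_read_lock(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	sdp_model[cpu].lock_count++;
}

/* Count a reader leaving, possibly on a different CPU's structure. */
static void model_read_unlock(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	sdp_model[cpu].unlock_count++;
}

/* Only the sums over all CPUs matter, not which CPU was charged. */
static long model_readers_active(void)
{
	long locks = 0, unlocks = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		locks += sdp_model[cpu].lock_count;
		unlocks += sdp_model[cpu].unlock_count;
	}
	return locks - unlocks;
}
```

A lock charged to CPU 0 followed by an unlock charged to CPU 2 still
balances to zero active readers here, which is the sense in which using
a possibly stale CPU pointer is harmless in this model.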

Both cases use r13 (paca) in a racy manner, and in both cases it should
be safe.
