Message-ID: <CAEXW_YQEarLt7YGQZdwmcSyZcGRCGKf89ovxjQdXBO-TgXAk-w@mail.gmail.com>
Date: Tue, 25 Apr 2023 07:06:08 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: Zhouyi Zhou <zhouzhouyi@...il.com>,
Christophe Leroy <christophe.leroy@....fr>
Cc: Peter Zijlstra <peterz@...radead.org>,
Boqun Feng <boqun.feng@...il.com>,
Segher Boessenkool <segher@...nel.crashing.org>,
Michael Ellerman <mpe@...erman.id.au>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
rcu <rcu@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>, lance@...osl.org,
"Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: BUG : PowerPC RCU: torture test failed with __stack_chk_fail
On Tue, Apr 25, 2023 at 6:58 AM Zhouyi Zhou <zhouzhouyi@...il.com> wrote:
>
> hi
>
> On Tue, Apr 25, 2023 at 6:13 PM Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Mon, Apr 24, 2023 at 02:55:11PM -0400, Joel Fernandes wrote:
> > > This is amazing debugging Boqun, like a boss! One comment below:
> > >
> > > > > > Or something simple I haven't thought of? :)
> > > > >
> > > > > At what points can r13 change? Only when some particular functions are
> > > > > called?
> > > > >
> > > >
> > > > r13 is the local paca:
> > > >
> > > > register struct paca_struct *local_paca asm("r13");
> > > >
> > > > , which is a pointer to percpu data.
> > > >
> > > > So if a task is scheduled from one CPU to another CPU, the
> > > > value gets changed.
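
To make the above concrete, a loose sketch (simplified; read_paca_canary()
is a made-up helper, not the real paca layout or API):

register struct paca_struct *local_paca asm("r13");

static inline u64 read_paca_canary(void)
{
	/*
	 * Every load through r13 observes whichever CPU's paca the
	 * task is running on *right now*.  If the compiler copies
	 * r13, or an address derived from it, into another register
	 * and the task migrates in between, later accesses hit the
	 * old CPU's paca instead.
	 */
	return local_paca->canary;
}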
> > >
> > > It appears the whole issue, per your analysis, is that the stack
> > > checking code in gcc should not cache or alias r13, and must read its
> > > most up-to-date value during stack checking, as its value may have
> > > changed during a migration to a new CPU.
> > >
> > > Did I get that right?
> > >
> > > IMO, even without a reproducer, gcc on PPC should just not do that,
> > > that feels terribly broken for the kernel. I wonder what clang does,
> > > I'll go poke around with compilerexplorer after lunch.
> > >
> > > Adding +Peter Zijlstra as well to join the party as I have a feeling
> > > he'll be interested. ;-)
> >
> > I'm a little confused; the way I understand the whole stack protector
> > thing to work is that we push a canary on the stack at call and on
> > return check it is still valid. Since in general tasks randomly migrate,
> > the per-cpu validation canary should be the same on all CPUs.
> >
> > Additionally, the 'new' __srcu_read_{,un}lock_nmisafe() functions use
> > raw_cpu_ptr() to get 'a' percpu sdp, preferably that of the local cpu,
> > but no guarantees.
> >
> > Both cases use r13 (paca) in a racy manner, and in both cases it should
> > be safe.
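
Right. For reference, the guard logic gcc emits is conceptually
something like the following (hand-written sketch, not actual compiler
output; on PPC64 the guard load is r13-relative per the commit options
quoted further down):

unsigned long some_function(void)
{
	/* prologue: stash the guard value in the stack frame */
	unsigned long canary = local_paca->canary;

	/* ... function body, on-stack buffers, etc. ... */

	/* epilogue: re-read the guard and compare */
	if (canary != local_paca->canary)
		__stack_chk_fail();	/* "stack smashing detected" */

	return 0;
}

As long as both loads go through the live r13, a migration in between
is harmless, because the task's canary is copied into the new CPU's
paca at switch time.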
> New test results today: both a gcc built from git (git clone
> git://gcc.gnu.org/git/gcc.git) and Ubuntu 22.04's gcc 12.1.0
> are immune to the above issue. We can see the assembly code at
> http://140.211.169.189/0425/srcu_gp_start_if_needed-gcc-12.txt
>
> while both the native gcc on a PPC VM (gcc version 9.4.0) and the
> gcc cross compiler on my x86 laptop (gcc version 10.4.0) still
> reproduce the bug.
Do you know what fixes the issue? I would not declare victory yet. My
feeling is that something changed in the timing or in the compiler's
codegen, which merely hides the issue. If so, the bug is still there
and it is just a matter of time before someone else reports it.
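
For the archive, the failure mode we are worried about would be gcc
doing the moral equivalent of this (illustrative C, not actual
compiler output):

/*
 * The transform the compiler must not make: if r13, or an address
 * derived from it, is kept in another register across a point where
 * the task can migrate, the epilogue checks against the *old* CPU's
 * paca, whose canary meanwhile belongs to whatever task runs there.
 */
unsigned long *guard = &local_paca->canary;	/* derived from r13 */
unsigned long canary = *guard;			/* our canary, CPU A */

/* ... body: task migrates to CPU B, CPU A runs another task ... */

if (canary != *guard)		/* stale paca -> spurious mismatch */
	__stack_chk_fail();

So the stack is perfectly healthy, yet we trip __stack_chk_fail()
anyway, which matches the torture-test splat.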
Out of curiosity for the PPC folks, why can't 64-bit PPC use a
per-task canary? Michael, is this an optimization? Adding Christophe
as well, since the r13-based canary came in a few years ago via the
following commit:
commit 06ec27aea9fc84d9c6d879eb64b5bcf28a8a1eb7
Author: Christophe Leroy <christophe.leroy@....fr>
Date: Thu Sep 27 07:05:55 2018 +0000
powerpc/64: add stack protector support
On PPC64, as register r13 points to the paca_struct at all time,
this patch adds a copy of the canary there, which is copied at
task_switch.
That new canary is then used by using the following GCC options:
-mstack-protector-guard=tls
-mstack-protector-guard-reg=r13
-mstack-protector-guard-offset=offsetof(struct paca_struct, canary))
Signed-off-by: Christophe Leroy <christophe.leroy@....fr>
Signed-off-by: Michael Ellerman <mpe@...erman.id.au>
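
So IIUC the arrangement is roughly this (sketch based on the commit
message above; setup_canary() is a made-up name, and the real copy
happens in the switch path):

struct paca_struct {
	/* ... */
	u64 canary;	/* gcc reads the guard at r13 + offset */
};

/*
 * On context switch, the incoming task's canary is copied into this
 * CPU's paca, so a load through the live r13 always yields the
 * current task's guard value.
 */
static inline void setup_canary(struct task_struct *next)
{
	local_paca->canary = next->stack_canary;
}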
- Joel