Date:   Mon, 03 Jul 2023 18:34:42 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     Conor Dooley <conor@...nel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Anup Patel <apatel@...tanamicro.com>,
        Palmer Dabbelt <palmer@...osinc.com>,
        lkp <oliver.sang@...el.com>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com
Subject: Re: [mm] 408579cd62: WARNING:suspicious_RCU_usage

On Mon, 03 Jul 2023 18:20:52 +0100,
Conor Dooley <conor@...nel.org> wrote:
> 
> On Mon, Jul 03, 2023 at 10:07:28AM -0700, Linus Torvalds wrote:
> > On Mon, 3 Jul 2023 at 10:00, Conor Dooley <conor@...nel.org> wrote:
> > >
> > > I'm not entirely sure if it is related, as stuff in the guts of mm like
> > > this is beyond me, but I've been seeing similar warnings on RISC-V.
> > 
> > No, that RISC-V warning is also about bad RCU usage, but that's a
> > different thing.
> > 
> > >         RCU used illegally from offline CPU!
> > >         rcu_scheduler_active = 1, debug_locks = 1
> > >         1 lock held by swapper/1/0:
> > >          #0: ffffffff8169ceb0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x0/0x32
> > >
> > >         stack backtrace:
> > >         CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.4.0-10173-ga901a3568fd2 #1
> > >         Hardware name: riscv-virtio,qemu (DT)
> > >         Call Trace:
> > >         [<ffffffff80006a20>] show_stack+0x2c/0x38
> > >         [<ffffffff80af3ee0>] dump_stack_lvl+0x5e/0x80
> > >         [<ffffffff80af3f16>] dump_stack+0x14/0x1c
> > >         [<ffffffff80083ff0>] lockdep_rcu_suspicious+0x19e/0x232
> > >         [<ffffffff80ad4802>] mtree_load+0x18a/0x3b6
> > >         [<ffffffff80091632>] __irq_get_desc_lock+0x2c/0x82
> > >         [<ffffffff80094722>] enable_percpu_irq+0x36/0x9e
> > >         [<ffffffff800087d4>] riscv_ipi_enable+0x32/0x4e
> > >         [<ffffffff80008692>] smp_callin+0x24/0x66
> > 
> > This is also triggering on the maple tree sanity checks, but it's a
> > different maple tree, and a different code sequence.
> > 
> > And a different case of suspicious RCU usage - not a lack of locking,
> > but simply using RCU before marking the CPU online.
> 
> Ah, I probably should've known from the
>          RCU used illegally from offline CPU!
> that it was different.
> 
> > I suspect the riscv_ipi_enable() in the RISC-V version of smp_callin()
> > needs to be moved down to below the
> > 
> >         set_cpu_online(curr_cpuid, 1);
> > 
> > or was there some reason why it needed to be done quite _that_ early
> > in commit 832f15f42646 ("RISC-V: Treat IPIs as normal Linux IRQs")?
> > 
> > Added guilty parties to the cc.
> 
> Taking the rationale & potential problems out of the equation, that
> code movement does suppress the complaints from rcu/maple tree,
> thanks.

Comparing with what we do on arm64, a less radical change would be to
move the IPI init after notify_cpu_starting(), which explicitly
enables RCU usage.

Something like:

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index bb0b76e1a6d4..f4d6acb38dd0 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -238,10 +238,11 @@ asmlinkage __visible void smp_callin(void)
 	mmgrab(mm);
 	current->active_mm = mm;
 
-	riscv_ipi_enable();
-
 	store_cpu_topology(curr_cpuid);
 	notify_cpu_starting(curr_cpuid);
+
+	riscv_ipi_enable();
+
 	numa_add_cpu(curr_cpuid);
 	set_cpu_online(curr_cpuid, 1);
 	probe_vendor_features(curr_cpuid);

which I obviously haven't tested at all.
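
To make the ordering argument concrete, here is a toy userspace model in C.
It is only a sketch, not kernel code: every toy_* name and the struct are
made up for illustration. It assumes only what the trace and the discussion
above say, namely that enabling the IPI does an IRQ descriptor lookup which
enters an RCU read-side section, and that notify_cpu_starting() is the point
from which RCU usage is allowed on the incoming CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy per-CPU state; hypothetical, not a real kernel structure. */
struct toy_cpu {
	bool rcu_watching;	/* set once the CPU has announced itself */
	bool ipi_enabled;
	bool online;
	int warnings;		/* count of "RCU used from offline CPU" events */
};

/* Stand-in for an RCU read-side section, as reached via mtree_load(). */
static void toy_rcu_read(struct toy_cpu *cpu)
{
	if (!cpu->rcu_watching) {
		cpu->warnings++;
		fprintf(stderr, "toy: RCU used illegally from offline CPU!\n");
	}
}

/* Stand-in for notify_cpu_starting(): from here on RCU may be used. */
static void toy_notify_cpu_starting(struct toy_cpu *cpu)
{
	cpu->rcu_watching = true;
}

/* Stand-in for riscv_ipi_enable(): the descriptor lookup uses RCU. */
static void toy_ipi_enable(struct toy_cpu *cpu)
{
	toy_rcu_read(cpu);
	cpu->ipi_enabled = true;
}

/*
 * Model of the secondary CPU bring-up path. Returns the number of
 * warnings raised for the chosen ordering.
 */
int toy_smp_callin(bool ipi_after_notify)
{
	struct toy_cpu cpu = { 0 };

	if (!ipi_after_notify)
		toy_ipi_enable(&cpu);	/* current ordering */

	toy_notify_cpu_starting(&cpu);

	if (ipi_after_notify)
		toy_ipi_enable(&cpu);	/* ordering from the diff above */

	cpu.online = true;		/* set_cpu_online() still comes last */
	assert(cpu.ipi_enabled && cpu.online);
	return cpu.warnings;
}
```

Calling toy_smp_callin(false) models the current code and counts one
warning; toy_smp_callin(true) models the proposed ordering and counts none,
while the IPI is still enabled before the CPU is marked online.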

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
