lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170309035527.GB6416@lerouge>
Date:   Thu, 9 Mar 2017 04:55:29 +0100
From:   Frederic Weisbecker <fweisbec@...il.com>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     linux-kernel@...r.kernel.org, mingo@...hat.com,
        peterz@...radead.org, tglx@...utronix.de
Subject: Re: RCU used on incoming CPU before rcu_cpu_starting() called

On Wed, Mar 08, 2017 at 03:41:52PM -0800, Paul E. McKenney wrote:
> On Wed, Mar 08, 2017 at 02:16:56PM -0800, Paul E. McKenney wrote:
> > Hello!
> > 
> > I am seeing the following splat in rcutorture testing of v4.11-rc1:
> > 
> > [   30.694013] =============================
> > [   30.694013] WARNING: suspicious RCU usage
> > [   30.694013] 4.11.0-rc1+ #1 Not tainted
> > [   30.694013] -----------------------------
> > [   30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
> > [   30.694013] 
> > [   30.694013] other info that might help us debug this:
> > [   30.694013] 
> > [   30.694013] 
> > [   30.694013] RCU used illegally from offline CPU!
> > [   30.694013] rcu_scheduler_active = 2, debug_locks = 0
> > [   30.694013] no locks held by swapper/1/0.
> > [   30.694013] 
> > [   30.694013] stack backtrace:
> > [   30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
> > [   30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > [   30.694013] Call Trace:
> > [   30.694013]  dump_stack+0x67/0x99
> > [   30.694013]  lockdep_rcu_suspicious+0xe7/0x120
> > [   30.694013]  get_work_pool+0x82/0x90
> > [   30.694013]  __queue_work+0x70/0x5f0
> > [   30.694013]  queue_work_on+0x33/0x70
> > [   30.694013]  clear_sched_clock_stable+0x33/0x40
> > [   30.694013]  early_init_intel+0xe7/0x2f0
> > [   30.694013]  init_intel+0x11/0x350
> > [   30.694013]  identify_cpu+0x344/0x5a0
> > [   30.694013]  identify_secondary_cpu+0x18/0x80
> > [   30.694013]  smp_store_cpu_info+0x39/0x40
> > [   30.694013]  start_secondary+0x4e/0x100
> > [   30.694013]  start_cpu+0x14/0x14
> > 
> > Here is the relevant code from x86's smp_callin():
> > 
> > 	/*
> > 	 * Save our processor parameters. Note: this information
> > 	 * is needed for clock calibration.
> > 	 */
> > 	smp_store_cpu_info(cpuid);
> > 
> > 	/*
> > 	 * Get our bogomips.
> > 	 * Update loops_per_jiffy in cpu_data. Previous call to
> > 	 * smp_store_cpu_info() stored a value that is close but not as
> > 	 * accurate as the value just calculated.
> > 	 */
> > 	calibrate_delay();
> > 	cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
> > 	pr_debug("Stack at about %p\n", &cpuid);
> > 
> > 	/*
> > 	 * This must be done before setting cpu_online_mask
> > 	 * or calling notify_cpu_starting.
> > 	 */
> > 	set_cpu_sibling_map(raw_smp_processor_id());
> > 	wmb();
> > 
> > 	notify_cpu_starting(cpuid);
> > 
> > The problem is that smp_store_cpu_info() indirectly invokes
> > schedule_work(), which wants to use RCU.  But RCU isn't informed
> > of the incoming CPU until the call to notify_cpu_starting(), which
> > causes lockdep to complain bitterly about the use of RCU by the
> > premature call to schedule_work().
> > 
> > I considered just moving the notify_cpu_starting() earlier in the
> > sequence, but the comments make it seem like this would not be
> > a wise choice.
> > 
> > Any suggestions?

Calling schedule_work() from an offline (booting) CPU doesn't sound like a good
idea in the first place. And neither is it a good idea to allow using
RCU on a CPU that is not yet online.

Perhaps we could delay this sched clock stability check to a later
stage in the secondary CPU boot-up code? Once the CPU is online
and RCU is initialized? For example it could be a CPU_ONLINE hotplug
callback.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ