linux-kernel - Re: [PATCH 05/16] rcu: De-offloading CB kthread

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20201104144209.GA2748545@boqun-archlinux>
Date:   Wed, 4 Nov 2020 22:42:09 +0800
From:   Boqun Feng <boqun.feng@...il.com>
To:     Frederic Weisbecker <frederic@...nel.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Neeraj Upadhyay <neeraju@...eaurora.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        Josh Triplett <josh@...htriplett.org>
Subject: Re: [PATCH 05/16] rcu: De-offloading CB kthread

On Wed, Nov 04, 2020 at 03:31:35PM +0100, Frederic Weisbecker wrote:
[...]
> > 
> > > +	rcu_segcblist_offload(cblist, false);
> > > +	raw_spin_unlock_rcu_node(rnp);
> > > +
> > > +	if (rdp->nocb_cb_sleep) {
> > > +		rdp->nocb_cb_sleep = false;
> > > +		wake_cb = true;
> > > +	}
> > > +	rcu_nocb_unlock_irqrestore(rdp, flags);
> > > +
> > > +	if (wake_cb)
> > > +		swake_up_one(&rdp->nocb_cb_wq);
> > > +
> > > +	swait_event_exclusive(rdp->nocb_state_wq,
> > > +			      !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB));
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static long rcu_nocb_rdp_deoffload(void *arg)
> > > +{
> > > +	struct rcu_data *rdp = arg;
> > > +
> > > +	WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
> > 
> > I think this warning can actually happen, if I understand how workqueue
> > works correctly. Consider that the corresponding cpu gets offlined right
> > after the rcu_nocb_cpu_deoffloaed(), and the workqueue of that cpu
> > becomes unbound, and IIUC, workqueues don't do migration during
> > cpu-offlining, which means the worker can be scheduled to other CPUs,
> > and the work gets executed on another cpu. Am I missing something here?.
> 
> We are holding cpus_read_lock() in rcu_nocb_cpu_offload(), this should
> prevent from that.
> 

But what if the work doesn't get executed until we cpus_read_unlock()
and someone offlines that CPU?

Regards,
Boqun

> Thanks!