linux-kernel - Re: [PATCH] fix rcu vs hotplug race

Message-ID: <20080624110144.GA8695@elte.hu>
Date:	Tue, 24 Jun 2008 13:01:44 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Gautham R Shenoy <ego@...ibm.com>
Cc:	Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
	paulmck@...ux.vnet.ibm.com, Dipankar Sarma <dipankar@...ibm.com>,
	laijs@...fujitsu.com, Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	lkml <linux-kernel@...r.kernel.org>,
	"Paul E. McKenney" <paulmck@...ibm.com>
Subject: Re: [PATCH] fix rcu vs hotplug race


* Gautham R Shenoy <ego@...ibm.com> wrote:

> > hm, not sure - we might just be fighting the symptom and we might 
> > now create a silent resource leak instead. Isn't a full RCU quiescent 
> > state forced (on all CPUs) before a CPU is cleared out of 
> > cpu_online_map? That way the to-be-offlined CPU should never 
> > actually show up in rcp->cpumask.
> 
> No, this does not happen currently. The rcp->cpumask is always 
> initialized to cpu_online_map&~nohz_cpu_mask when we start a new 
> batch. Hence, before the batch ends, if a cpu goes offline we _can_ 
> have a stale rcp->cpumask, till the RCU subsystem has handled its 
> CPU_DEAD notification.
> 
> Thus for a tiny interval, the rcp->cpumask would contain the offlined 
> CPU. One of the alternatives is probably to handle this using 
> CPU_DYING notifier instead of CPU_DEAD where we can call 
> __rcu_offline_cpu().
> 
> The WARN_ON that Dhaval was hitting was because of some cpu-offline 
> that was called just before we did a local_irq_save inside call_rcu(). 
> But at that time, the rcp->cpumask was still stale, and hence we ended 
> up sending a smp_reschedule() to an offlined cpu. So the check may not 
> create any resource leak.

The check itself may not, but the problem it highlights might, and with 
the patch we'd end up hiding potential problems in this area.
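
To make the window described in the quoted text concrete, here is a minimal user-space sketch of it. This is not the kernel code: the one-bit-per-CPU cpumask_t, the helper names (rcu_start_batch, cpu_offline, model_rcu_offline_cpu, stale_cpus, resched_targets) and the idea that the check simply masks the batch mask against cpu_online_map are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical model: one bit per CPU, at most 4 CPUs. */
typedef unsigned long cpumask_t;
#define NR_MODEL_CPUS 4

static cpumask_t cpu_online_map = 0xFUL; /* CPUs 0-3 online */
static cpumask_t nohz_cpu_mask  = 0x0UL; /* no CPU in nohz idle */

struct rcu_ctrlblk {
	cpumask_t cpumask; /* CPUs that still owe a quiescent state */
};

/* A new batch snapshots the online, non-nohz CPUs. */
static void rcu_start_batch(struct rcu_ctrlblk *rcp)
{
	rcp->cpumask = cpu_online_map & ~nohz_cpu_mask;
}

/* The CPU leaves cpu_online_map; the batch snapshot is NOT touched. */
static void cpu_offline(int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
	cpu_online_map &= ~(1UL << cpu);
}

/* Stand-in for the later CPU_DEAD handling that cleans up the batch mask. */
static void model_rcu_offline_cpu(struct rcu_ctrlblk *rcp, int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
	rcp->cpumask &= ~(1UL << cpu);
}

/* CPUs present in the batch mask although they are already offline. */
static cpumask_t stale_cpus(const struct rcu_ctrlblk *rcp)
{
	return rcp->cpumask & ~cpu_online_map;
}

/* Assumed form of the check: only poke CPUs that are still online. */
static cpumask_t resched_targets(const struct rcu_ctrlblk *rcp)
{
	return rcp->cpumask & cpu_online_map;
}
```

Between cpu_offline() and model_rcu_offline_cpu(), stale_cpus() is non-empty. In this model resched_targets() avoids the offlined CPU, but the stale bit stays in rcp->cpumask until the cleanup runs, which is the state the patch would no longer warn about.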

Paul, what do you think about this mixed CPU hotplug plus RCU workload?

	Ingo