linux-kernel - Re: workqueue deadlock


Message-Id: <20061210041600.56306676.akpm@osdl.org>
Date:	Sun, 10 Dec 2006 04:16:00 -0800
From:	Andrew Morton <akpm@...l.org>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	vatsa@...ibm.com, Bjorn Helgaas <bjorn.helgaas@...com>,
	linux-kernel@...r.kernel.org, Myron Stowe <myron.stowe@...com>,
	Jens Axboe <axboe@...nel.dk>, Dipankar <dipankar@...ibm.com>,
	Gautham shenoy <ego@...ibm.com>
Subject: Re: workqueue deadlock

On Sun, 10 Dec 2006 12:49:43 +0100
Ingo Molnar <mingo@...e.hu> wrote:

> > > 	void cpu_hotplug_lock(void)

This is actually not cpu-hotplug safe ;)  

> > > 	{
> > > 		int cpu = raw_smp_processor_id();
> > > 		/*
> > > 		 * Interrupts/softirqs are hotplug-safe:
> > > 		 */
> > > 		if (in_interrupt())
> > > 			return;
> > > 		if (current->hotplug_depth++)
> > > 			return;

<preempt, cpu hot-unplug, resume on different CPU>

> > > 		current->hotplug_lock = &per_cpu(hotplug_lock, cpu);

<use-after-free>

> > > 		mutex_lock(current->hotplug_lock);

And mutex_lock() sleeps, so we can't close that window with
preempt_disable().

> > > 	}

It's worth noting that this very common sequence:

	preempt_disable();
	cpu = smp_processor_id();
	...
	preempt_enable();

also provides cpu-hotunplug protection against scenarios such as the above.
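
A minimal user-space sketch of why that sequence helps, assuming (as I
understand the stop_machine_run()-based unplug path) that a CPU cannot be
taken down until it is able to schedule again. Every name below (toy_cpu,
toy_preempt_disable(), toy_cpu_down(), ...) is hypothetical and made up for
illustration; none of it is kernel API, and the real mechanism is more
involved than a counter check.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Toy model only: these are NOT kernel structures or functions. */
struct toy_cpu {
	bool online;        /* false once the CPU has been "unplugged" */
	int  preempt_count; /* > 0: the task on this CPU has preemption off */
};

struct toy_cpu toy_cpus[TOY_NR_CPUS];

/* Bring every modelled CPU online with preemption enabled. */
void toy_init(void)
{
	for (int i = 0; i < TOY_NR_CPUS; i++) {
		toy_cpus[i].online = true;
		toy_cpus[i].preempt_count = 0;
	}
}

void toy_preempt_disable(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_cpus[cpu].preempt_count++;
}

void toy_preempt_enable(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	assert(toy_cpus[cpu].preempt_count > 0);
	toy_cpus[cpu].preempt_count--;
}

/*
 * Assumption of this sketch: unplug cannot complete while the target CPU
 * is inside a preempt-disabled section, because that CPU never gets to
 * schedule the thread that would take it down.  Returns true if the CPU
 * was taken offline.
 */
bool toy_cpu_down(int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	if (toy_cpus[cpu].preempt_count > 0)
		return false;
	toy_cpus[cpu].online = false;
	return true;
}
```

In this model a toy_cpu_down(1) issued between toy_preempt_disable(1) and
toy_preempt_enable(1) is refused, so the CPU number read inside the section
stays valid until the section ends. The catch noted above still applies:
nothing that sleeps may sit inside such a section.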

> > That's functionally equivalent to what we have now, and it isn't 
> > working too well.
> 
> hm, i thought the main reason of not using cpu_hotplug_lock() in a 
> widespread manner was not related to its functionality but to its 
> scalability - but i could be wrong.

If there is a scalability problem, it hasn't been noticed yet.

I suspect a large part of the reason for that is that we only really care
about hot-unplug when this CPU reaches across to some other CPU's data.  Often
_all_ other CPUs' data.  And that's a super-inefficient thing, so it's rare.

Most of the problems we've had are due to borkage in cpufreq.  And that's
simply cruddy code - it's not due to the complexity of CPU hotplug per se.

> The one above is scalable and we 
> could use it as /the/ method to control CPU hotplug. All the flux i 
> remember related to cpu_hotplug_lock() use from the fork path and from 
> other scheduler hotpaths related to its scalability bottleneck, not to 
> its locking efficiency.

One quite different way of addressing all of this is to stop using
stop_machine_run() for hotplug synchronisation and switch to the swsusp
freezer infrastructure: all kernel threads and user processes need to stop
and park themselves in a known state before we allow the CPU to be removed.
lock_cpu_hotplug() becomes a no-op.

Dunno if it'll work - I only just thought of it.  It sure would simplify
things.
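
A rough user-space sketch of the idea, under the assumption that "parked"
can be modelled as a per-task flag and that removal is simply refused
while any task is still running. All names here (toy_task,
toy_freeze_all(), toy_remove_cpu(), ...) are hypothetical and are not the
swsusp freezer API; real tasks would also need time to reach a freeze
point rather than complying instantly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_TASKS 3

/* Toy model only: NOT the kernel's task or freezer structures. */
struct toy_task {
	bool frozen; /* true once the task has parked itself */
};

struct toy_task toy_tasks[TOY_NR_TASKS];

/* Ask every task to park; in this toy they comply immediately. */
void toy_freeze_all(void)
{
	for (size_t i = 0; i < TOY_NR_TASKS; i++)
		toy_tasks[i].frozen = true;
}

/* Let every task run again. */
void toy_thaw_all(void)
{
	for (size_t i = 0; i < TOY_NR_TASKS; i++)
		toy_tasks[i].frozen = false;
}

bool toy_all_frozen(void)
{
	for (size_t i = 0; i < TOY_NR_TASKS; i++)
		if (!toy_tasks[i].frozen)
			return false;
	return true;
}

/*
 * Removal proceeds only while every task is parked, so no task can be in
 * the middle of touching the departing CPU's data.  Callers therefore
 * need no per-call-site hotplug lock in this model.
 */
bool toy_remove_cpu(bool *cpu_online)
{
	assert(cpu_online != NULL);
	if (!toy_all_frozen())
		return false;
	*cpu_online = false;
	return true;
}
```

The intended sequence is toy_freeze_all(), toy_remove_cpu(), then
toy_thaw_all(). The point of the design is that the synchronisation cost
moves entirely into the (rare) unplug operation instead of being paid by
every code path that looks at per-CPU state.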