[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20061210082616.GB14057@elte.hu>
Date: Sun, 10 Dec 2006 09:26:16 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Andrew Morton <akpm@...l.org>
Cc: vatsa@...ibm.com, Bjorn Helgaas <bjorn.helgaas@...com>,
linux-kernel@...r.kernel.org, Myron Stowe <myron.stowe@...com>,
Jens Axboe <axboe@...nel.dk>, Dipankar <dipankar@...ibm.com>,
Gautham shenoy <ego@...ibm.com>
Subject: Re: workqueue deadlock
* Andrew Morton <akpm@...l.org> wrote:
> > > Could do, not sure. I'm planning on converting all the locking
> > > around here to preempt_disable() though.
> >
> > please at least use an owner-recursive per-CPU lock,
>
> a wot?
something like the pseudocode further below - when applied to a data
structure it has semantics and scalability close to that of
preempt_disable(), but it is still preemptible and the lock is specific.
> > not a naked preempt_disable()! The concurrency rules for data
> > structures changed via preempt_disable() are quite hard to sort out
> > after the fact. (preempt_disable() is too opaque,
>
> preempt_disable() is the preferred way of holding off cpu hotplug.
well, preempt_disable() is the scheduler's internal mechanism to keep
tasks from being preempted. It is fast but it also has non-nice
side-effect:
1) nothing tells us what the connection between preempt-disable and data
structure is. If we let preempt_disable() spread then we'll end up
with a situation like the BKL: all preempt_disable() sections become
one big blob of code with hard-to-define specifications, and if we
take out code from that blob stuff mysteriously breaks. If on the
other hand we use a specific lock, that is visible in the source
already (and can be programmatically attached to the data structure
too, like i did it in the PI-list debugging code, where the
connection between the lock and the list is explicit and the
debugging code checks that the lock is held when the data structure
is accessed.)
2) it disables 'all preemption', not 'CPU hotplug down on this CPU' - so
it's a cannon to shoot at sparrows.
> > it doesnt attach data structure to critical section, like normal
> > locks do.)
>
> the data structure is the CPU, and its per-cpu data. And cpu_online_map.
The abstract data structure of CPU hotplug is /not/ "the CPU" but:
a CPU's ability to run tasks, expressed via cpu_online_map, the
runqueues, the per-CPU systems kthreads and all the other hotplug
data structures.
"CPU hotplug code" is the implementation of that data structure, used
via the 'bring CPU down' and 'CPU up' methods of the data structure.
preempt_disable() on the other hand is a scheduler-internal method that
keeps a context from being forcibly rescheduled. As such it is a very
broad superset of some of CPU-hotplug's requirements (because if a CPU
has a task that cannot be rescheduled it then obviously the property of
the CPU to run tasks is frozen), but IMO totally unsuitable for use as a
real hotplug API. The two things are really quite fundamentally
different.
the 'natural' locking method of the CPU hotplug data structures is:
"keep this CPU from being brought down". The pseudocode below expresses
this and only this.
struct task_struct {
...
int hotplug_depth;
struct mutex *hotplug_lock;
}
...
DEFINE_PER_CPU(struct mutex, hotplug_lock);
void cpu_hotplug_lock(void)
{
int cpu = raw_smp_processor_id();
/*
* Interrupts/softirqs are hotplug-safe:
*/
if (in_interrupt())
return;
if (current->hotplug_depth++)
return;
current->hotplug_lock = &per_cpu(hotplug_lock, cpu);
mutex_lock(current->hotplug_lock);
}
void cpu_hotplug_unlock(void)
{
int cpu;
if (in_interrupt())
return;
if (DEBUG_LOCKS_WARN_ON(!current->hotplug_depth))
return;
if (--current->hotplug_depth)
return;
mutex_unlock(current->hotplug_lock);
current->hotplug_lock = NULL;
}
...
void do_exit(void)
{
...
DEBUG_LOCKS_WARN_ON(current->hotplug_depth);
...
}
...
copy_process(void)
{
...
p->hotplug_depth = 0;
p->hotplug_lock = NULL;
...
}
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists