[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20071017115723.GA143@tv-sign.ru>
Date: Wed, 17 Oct 2007 15:57:23 +0400
From: Oleg Nesterov <oleg@...sign.ru>
To: Gautham R Shenoy <ego@...ibm.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org,
Srivatsa Vaddagiri <vatsa@...ibm.com>,
Rusty Russel <rusty@...tcorp.com.au>,
Dipankar Sarma <dipankar@...ibm.com>,
Ingo Molnar <mingo@...e.hu>,
Paul E McKenney <paulmck@...ibm.com>
Subject: Re: [RFC PATCH 4/4] Remove CPU_DEAD/CPU_UP_CANCELLED handling from workqueue.c
On 10/16, Gautham R Shenoy wrote:
>
> cleanup_workqueue_thread() in the CPU_DEAD and CPU_UP_CANCELLED path
> will cause a deadlock if the worker thread is executing a work item
> which is blocked on get_online_cpus(). This will lead to a irrecoverable
> hang.
>
> Solution is not to cleanup the worker thread. Instead let it remain
> even after the cpu goes offline. Since no one can queue any work
> on an offlined cpu, this thread will be forever sleeping, untill
> someone onlines the cpu.
Imho: not perfect, but acceptable. We can re-introduce cwq->should_stop
flag, so that cwq->thread eventually exits after it flushed ->worklist.
But see below.
> @@ -679,6 +680,7 @@ init_cpu_workqueue(struct workqueue_stru
> spin_lock_init(&cwq->lock);
> INIT_LIST_HEAD(&cwq->worklist);
> init_waitqueue_head(&cwq->more_work);
> + cwq->thread = NULL;
Not needed, cwq->thread == NULL because alloc_percpu() uses GFP_ZERO.
> return cwq;
> }
> @@ -712,7 +714,7 @@ static void start_workqueue_thread(struc
>
> if (p != NULL) {
> if (cpu >= 0)
> - kthread_bind(p, cpu);
> + set_cpus_allowed(p, cpumask_of_cpu(cpu));
OK, this is understandable, but see below...
> wake_up_process(p);
> }
> }
> @@ -848,6 +850,9 @@ static int __devinit workqueue_cpu_callb
>
> switch (action) {
> case CPU_UP_PREPARE:
> + if (likely(cwq->thread != NULL &&
> + !IS_ERR(cwq->thread)))
IS_ERR(cwq->thread) is not possible. cwq->thread is either NULL, or
a valid task_struct.
> + break;
> if (!create_workqueue_thread(cwq, cpu))
> break;
> printk(KERN_ERR "workqueue [%s] for %i failed\n",
> @@ -858,12 +863,6 @@ static int __devinit workqueue_cpu_callb
> case CPU_ONLINE:
> start_workqueue_thread(cwq, cpu);
> break;
> -
> - case CPU_UP_CANCELED:
> - start_workqueue_thread(cwq, -1);
> - case CPU_DEAD:
> - cleanup_workqueue_thread(cwq, cpu);
> - break;
> }
> }
Unfortunately, this approach seem to have a sublte problem.
Suppose that CPU 0 goes down. Let's denote the corresponding cwq->thread as T.
When CPU goes offline, T is not bound to CPU 0 any longer. So far this is OK.
Now suppose that CPU 0 goes online again. There is a window before CPU_ONLINE
(note that CPU 0 is already online at this time) binds T back to CPU 0. It is
possible that another thread running on CPU 0 does schedule_work() in that window,
or it can run on any CPU but calls schedule_delayed_work_on(0).
In this case the work_struct may run on the wrong CPU because T is ready to
execute the work_struct, this is bug. work->func() has all rights to expect
that cwq->thread is bound to the right CPU.
Let's take cache_reap() as an example. It is critical that this function is
really "per cpu", otherwise it can use the wrong per-cpu data, and even the
last schedule_delayed_work() is not reliable.
(Sorry, have no time to read the other patches now, will do on weekend).
Oleg.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists