Message-ID: <4A1C9DFF.70708@cn.fujitsu.com>
Date: Wed, 27 May 2009 09:57:19 +0800
From: Lai Jiangshan <laijs@...fujitsu.com>
To: paulmck@...ux.vnet.ibm.com
CC: linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
netfilter-devel@...r.kernel.org, mingo@...e.hu,
akpm@...ux-foundation.org, torvalds@...ux-foundation.org,
davem@...emloft.net, dada1@...mosbay.com, zbr@...emap.net,
jeff.chua.linux@...il.com, paulus@...ba.org, jengelh@...ozas.de,
r000n@...0n.net, benh@...nel.crashing.org,
mathieu.desnoyers@...ymtl.ca
Subject: Re: [PATCH RFC] v7 expedited "big hammer" RCU grace periods
Paul E. McKenney wrote:
>
> I am concerned about the following sequence of events:
>
> o synchronize_sched_expedited() disables preemption, thus blocking
> offlining operations.
>
> o CPU 1 starts offlining CPU 0. It acquires the CPU-hotplug lock,
> and proceeds, and is now waiting for preemption to be enabled.
>
> o synchronize_sched_expedited() disables preemption, sees
> that CPU 0 is online, so initializes and queues a request,
> does a wake-up-process(), and finally does a preempt_enable().
>
> o CPU 0 is currently running a high-priority real-time process,
> so the wakeup does not immediately happen.
>
> o The offlining process completes, including the kthread_stop()
> to the migration task.
>
> o The migration task wakes up, sees kthread_should_stop(),
> and so exits without checking its queue.
>
> o synchronize_sched_expedited() waits forever for CPU 0 to respond.
>
> I suppose that one way to handle this would be to check for the CPU
> going offline before doing the wait_for_completion(), but I am concerned
> about races affecting this check as well.
>
> Or is there something in the CPU-offline process that makes the above
> sequence of events impossible?
>
> Thanx, Paul
>
>
I had realized this, which is why I wrote the following:
>
> The coupling of synchronize_sched_expedited() and migration_req
> is largely increased:
>
> 1) The offline cpu's per_cpu(rcu_migration_req, cpu) is handled.
> See migration_call::CPU_DEAD
synchronize_sched_expedited() will not wait forever for CPU#0, because
migration_call()'s CPU_DEAD case wakes up the requestors:
migration_call()
{
	...
	case CPU_DEAD:
	case CPU_DEAD_FROZEN:
		...
		/*
		 * No need to migrate the tasks: it was best-effort if
		 * they didn't take sched_hotcpu_mutex. Just wake up
		 * the requestors.
		 */
		spin_lock_irq(&rq->lock);
		while (!list_empty(&rq->migration_queue)) {
			struct migration_req *req;

			req = list_entry(rq->migration_queue.next,
					 struct migration_req, list);
			list_del_init(&req->list);
			spin_unlock_irq(&rq->lock);
			complete(&req->done);
			spin_lock_irq(&rq->lock);
		}
		spin_unlock_irq(&rq->lock);
	...
	...
}
My approach depends on the requestors being woken up in every case.
migration_call() does this for us, but the coupling is greatly
increased.
Lai