Message-ID: <CAKU+Ga_awbok9wTjbnrNbwqGcF0sagHpf729ZDxYdNVPMhc81A@mail.gmail.com>
Date: Wed, 28 Aug 2013 16:11:56 -0700
From: Jamie Liu <jamieliu@...gle.com>
To: Tejun Heo <tj@...nel.org>
Cc: Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] workqueue: cond_resched() after processing each work item
Hi Tejun,
On Wed, Aug 28, 2013 at 2:33 PM, Tejun Heo <tj@...nel.org> wrote:
> Would something like the following work? Can you please verify it?
I confirm that this works.
> Thanks.
> ---- 8< ----
> If !PREEMPT, a kworker running work items back to back can hog the
> CPU. This becomes dangerous when a self-requeueing work item which is
> waiting for something to happen races against stop_machine. Such a
> work item would requeue itself indefinitely, hogging the kworker and
> the CPU it's running on, while stop_machine waits for that CPU to
> enter the stopper and prevents anything else from happening on all
> other CPUs. The two deadlock.
>
> Jmamie Liu reports that this deadlock scenario exists around
s/Jmamie/Jamie/
> scsi_requeue_run_queue() and libata port multiplier support, where one
> port may exclude command processing from other ports. With the right
> timing, scsi_requeue_run_queue() can end up requeueing itself trying
> to execute an IO which is asked to be retried while another device has
> exclusive access, which in turn can't make forward progress due to
> stop_machine.
>
> Fix it by invoking cond_resched() after executing each work item.
>
> Signed-off-by: Tejun Heo <tj@...nel.org>
> Reported-by: Jamie Liu <jamieliu@...gle.com>
> References: http://thread.gmane.org/gmane.linux.kernel/1552567
> Cc: stable@...r.kernel.org
> --
> kernel/workqueue.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index f02c4a4..73b662b 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2201,6 +2201,15 @@ __acquires(&pool->lock)
> dump_stack();
> }
>
> + /*
> + * The following prevents a kworker from hogging CPU on !PREEMPT
> + * kernels, where a requeueing work item waiting for something to
> + * happen could deadlock with stop_machine as such work item could
> + * indefinitely requeue itself while all other CPUs are trapped in
> + * stop_machine.
> + */
> + cond_resched();
> +
> spin_lock_irq(&pool->lock);
>
> /* clear cpu intensive status */
Thanks,
Jamie