lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 28 Aug 2013 16:11:56 -0700
From:	Jamie Liu <jamieliu@...gle.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] workqueue: cond_resched() after processing each work item

Hi Tejun,

On Wed, Aug 28, 2013 at 2:33 PM, Tejun Heo <tj@...nel.org> wrote:
> Would something like the following work?  Can you please verify it?

I confirm that this works.

> Thanks.
> ---- 8< ----
> If !PREEMPT, a kworker running work items back to back can hog CPU.
> This becomes dangerous when a self-requeueing work item which is
> waiting for something to happen races against stop_machine.  Such
> self-requeueing work item would requeue itself indefinitely hogging
> the kworker and CPU it's running on while stop_machine would wait for
> that CPU to enter stop_machine while preventing anything else from
> happening on all other CPUs.  The two would deadlock.
>
> Jmamie Liu reports that this deadlock scenario exists around

s/Jmamie/Jamie/

> scsi_requeue_run_queue() and libata port multiplier support, where one
> port may exclude command processing from other ports.  With the right
> timing, scsi_requeue_run_queue() can end up requeueing itself trying
> to execute an IO which is asked to be retried while another device has
> an exclusive access, which in turn can't make forward progress due to
> stop_machine.
>
> Fix it by invoking cond_resched() after executing each work item.
>
> Signed-off-by: Tejun Heo <tj@...nel.org>
> Reported-by: Jamie Liu <jamieliu@...gle.com>
> References: http://thread.gmane.org/gmane.linux.kernel/1552567
> Cc: stable@...r.kernel.org
> --
>  kernel/workqueue.c |    9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index f02c4a4..73b662b 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2201,6 +2201,15 @@ __acquires(&pool->lock)
>                 dump_stack();
>         }
>
> +       /*
> +        * The following prevents a kworker from hogging CPU on !PREEMPT
> +        * kernels, where a requeueing work item waiting for something to
> +        * happen could deadlock with stop_machine as such work item could
> +        * indefinitely requeue itself while all other CPUs are trapped in
> +        * stop_machine.
> +        */
> +       cond_resched();
> +
>         spin_lock_irq(&pool->lock);
>
>         /* clear cpu intensive status */

Thanks,
Jamie
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ