lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070630114658.GA344@tv-sign.ru>
Date:	Sat, 30 Jun 2007 15:46:58 +0400
From:	Oleg Nesterov <oleg@...sign.ru>
To:	Ingo Molnar <mingo@...hat.com>
Cc:	Johannes Berg <johannes@...solutions.net>,
	Arjan van de Ven <arjan@...ux.intel.com>,
	Linux Kernel list <linux-kernel@...r.kernel.org>,
	linux-wireless <linux-wireless@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>, mingo@...e.hu,
	Thomas Sattler <tsattler@....de>
Subject: Re: [RFC/PATCH] debug workqueue deadlocks with lockdep

On 06/30, Ingo Molnar wrote:
>
> On Thu, 2007-06-28 at 19:33 +0200, Johannes Berg wrote:
> > No, that's not right either, but Arjan just helped me a bit with how
> > lockdep works and I think I have the right idea now. Ignore this for
> > now, I'll send a new patch in a few days.
> 
> ok. But in general, this is a very nice idea!
> 
> i've Cc:-ed Oleg. Oleg, what do you think? I think we should keep all
> the workqueue APIs specified in a form that makes them lockdep coverable
> like Johannes did. This debug mechanism could have helped with the
> recent DVB lockup that Thomas Sattler reported.

I think this idea is great!

Johannes, could you change wait_on_work() as well? Most users of
flush_workqueue() should be converted to use it.

> @@ -342,6 +351,9 @@ static int flush_cpu_workqueue(struct cp
>         } else {
>                 struct wq_barrier barr;
>
> +               lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +               lock_release(&cwq->wq->lockdep_map, 0, _THIS_IP_);
> +
>                 active = 0;
>                 spin_lock_irq(&cwq->lock);

I am not sure why you skip "if (cwq->thread == current)" case, it can
deadlock in the same way.

But, perhaps we should not change flush_cpu_workqueue(). If we detect the
deadlock, we will have num_online_cpus() reports, yes?

And,

>                 if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
> @@ -376,6 +388,8 @@ void fastcall flush_workqueue(struct wor
>         int cpu;
>
>         might_sleep();
> +       lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +       lock_release(&wq->lockdep_map, 0, _THIS_IP_);

one of the 2 callers was already modified. Perhaps it is better to add
lock_acquire() into the second caller, cleanup_workqueue_thread(), but
skip flush_cpu_workqueue() ?

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ