Message-ID: <20090518221403.GA7393@redhat.com>
Date:	Tue, 19 May 2009 00:14:03 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Johannes Berg <johannes@...solutions.net>,
	Ingo Molnar <mingo@...e.hu>,
	Zdenek Kabelac <zdenek.kabelac@...il.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Gautham R Shenoy <ego@...ibm.com>
Subject: Re: INFO: possible circular locking dependency at
	cleanup_workqueue_thread

On 05/18, Peter Zijlstra wrote:
>
> On Mon, 2009-05-18 at 22:16 +0200, Oleg Nesterov wrote:
> > On 05/18, Peter Zijlstra wrote:
> > >
> > > On Mon, 2009-05-18 at 21:47 +0200, Oleg Nesterov wrote:
> > > >
> > > > This output looks obviously wrong, Z does not depend on L1 or any
> > > > other lock.
> > >
> > > It does, L1 -> L2 -> Z as per 1 and 2
> > > which 3 obviously reverses.
> >
> > Yes, yes, I see. And, as I said, I can't explain what I mean.
> >
> > I mean... The output above looks as if we take L1 and Z in wrong order.
> > But Z has nothing to do with this deadlock, it can't depend on any lock
> > from the correctness pov. Except yes, we have it in L1->L2->Z->L1 cycle.
>
> AB-BC-CA deadlock
>
> Thread 1		Thread 2		Thread 3
>
> L(L1)
> 			L(L2)
> 						L(Z)
> L(L2)
> 			L(Z)
> 						L(L1)

Sure. Now Z really depends on L1. But if you change Thread 3 to take yet
another unique lock X under Z, then lockdep will complain that X depends
on L1, not Z.
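
To make that concrete, here is a minimal sketch of the three-thread case with the extra lock X (made-up mutexes standing in for the locks above, unlocks omitted):

	#include <linux/mutex.h>

	static DEFINE_MUTEX(L1);
	static DEFINE_MUTEX(L2);
	static DEFINE_MUTEX(Z);
	static DEFINE_MUTEX(X);

	static void thread1(void)
	{
		mutex_lock(&L1);
		mutex_lock(&L2);	/* lockdep records L1 -> L2 */
	}

	static void thread2(void)
	{
		mutex_lock(&L2);
		mutex_lock(&Z);		/* lockdep records L2 -> Z */
	}

	static void thread3(void)
	{
		mutex_lock(&Z);
		mutex_lock(&X);		/* lockdep records Z -> X; X is unique to this path */
		mutex_lock(&L1);	/* closes the cycle via L1 -> L2 -> Z -> X */
	}

The real problem is still the L1 -> L2 -> Z -> L1 cycle, but the header of the report names X and L1, because X is the lock we hold when the offending L1 acquisition is checked.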

To clarify, I am not saying the output is buggy. I only meant it could be
better. But I don't see how to improve it.

If we return to the original bug report, perhaps cpu_add_remove_lock
has nothing to do with this problem... we could get a similar output
if device_pm_lock() is called from a work_struct.
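
For example (nothing here is from the actual report, the work function is
made up), a work item like this is enough for lockdep to record a dependency
from the workqueue's lockdep map to dpm_list_mtx:

	#include <linux/pm.h>
	#include <linux/workqueue.h>

	/* made-up example: any work item that takes device_pm_lock() */
	static void pm_touching_work(struct work_struct *work)
	{
		device_pm_lock();	/* takes dpm_list_mtx */
		/* ... walk or modify the dpm list ... */
		device_pm_unlock();
	}
	static DECLARE_WORK(pm_work, pm_touching_work);

After that, flushing (or destroying) the workqueue while holding dpm_list_mtx,
or anything ordered after it, would give a cycle much like the one reported.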

> And you're saying, we can't have that deadlock because we don't have the
> 3 separate functions?

No,

> That is, there is no concurrency on Z because it's always taken under L2?

Yes, nobody else can hold Z when we take L2.

But this wasn't my point.

> For those situations we have the spin_lock_nest_lock(X, y) annotation,
> where we say, there cannot be any concurrency on x element of X, because
> all such locks are always taken under y.
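
(For reference, the usual shape of that annotation, with made-up structures:
every o->lock is only ever taken with big_lock held, so lockdep treats the
per-object locks as nested under big_lock instead of tracking an order among
them.)

	#include <linux/list.h>
	#include <linux/mutex.h>
	#include <linux/spinlock.h>

	struct obj {
		spinlock_t	lock;
		struct list_head entry;
	};

	static DEFINE_MUTEX(big_lock);	/* the "y": always held around the object locks */
	static LIST_HEAD(obj_list);

	static void lock_all_objects(void)
	{
		struct obj *o;

		mutex_lock(&big_lock);
		list_for_each_entry(o, &obj_list, entry)
			spin_lock_nest_lock(&o->lock, &big_lock);
		/* all objects are now stable; unlock everything later */
	}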

We can just kill L(Z) instead of annotating it; this changes nothing from
the correctness pov, we still have the same deadlock. But the output becomes
very clear: L1 depends on L2.


OK, please forget it. I am not sure why I started this thread; just because
I was a bit surprised when I realized that lockdep's output does not match
my naive expectations.

Oleg.

