linux-kernel - Re: Regression - locking (all from 2.6.28)

Date:	Tue, 03 Mar 2009 19:12:41 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	jan sonnek <ha2nny@...il.com>, linux-kernel@...r.kernel.org,
	viro@...iv.linux.org.uk, Catalin Marinas <catalin.marinas@....com>,
	Wu Fengguang <fengguang.wu@...el.com>
Subject: Re: Regression - locking (all from 2.6.28)

On Mon, 2009-03-02 at 12:11 -0800, Andrew Morton wrote:

> > Mar  1 00:07:03 localhost kernel: [   86.440261] =========================================================
> > Mar  1 00:07:03 localhost kernel: [   86.440266] [ INFO: possible irq lock inversion dependency detected ]
> > Mar  1 00:07:03 localhost kernel: [   86.440271] 2.6.29-rc6-mm1-hanny #17
> > Mar  1 00:07:03 localhost kernel: [   86.440273] ---------------------------------------------------------
> 
> I stared at this for a while, but my brain broke trying to work out
> what lockdep is trying to tell us.
> 
> > Mar  1 00:07:03 localhost kernel: [   86.440277] Xorg/2733 just changed the state of lock:
> > Mar  1 00:07:03 localhost kernel: [   86.440280]  (fasync_lock){.-....}, at: [<c01952bb>] kill_fasync+0x20/0x3a
> > Mar  1 00:07:03 localhost kernel: [   86.440292] but this lock took another, HARDIRQ-READ-irq-unsafe lock in the past:
> > Mar  1 00:07:03 localhost kernel: [   86.440296]  (&f->f_lock){+.+...}
> 
> This message needs help.  A lock cannot "take" another lock. 

It seemed a simple enough way to tell that the latter lock nests inside
the former lock.

So what it's saying is that we have:

  fasync_lock
    f->f_lock

nesting, and fasync_lock got used in hardirq context, but the lock that
was previously found to nest inside it was an IRQ-unsafe lock.

So $CODE could take f->f_lock, then an IRQ could happen and take
fasync_lock followed by f->f_lock, and we'd be stuck.
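
To make the rule concrete, here is a minimal userspace sketch of the
check described above. It is not lockdep's actual code: struct
lock_class, note_acquire(), irq_inversion() and fasync_scenario() are
hypothetical names made up for illustration, and the nesting
fasync_lock -> f->f_lock is simply assumed rather than tracked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily simplified per-lock usage state (not lockdep's). */
struct lock_class {
	const char *name;
	bool used_in_hardirq;     /* ever acquired from hardirq context */
	bool taken_irqs_enabled;  /* ever acquired with hardirqs on: irq-unsafe */
};

/* Record one acquisition of lock class l and the context it happened in. */
static void note_acquire(struct lock_class *l, bool in_hardirq, bool irqs_enabled)
{
	assert(l != NULL);
	if (in_hardirq)
		l->used_in_hardirq = true;
	if (irqs_enabled)
		l->taken_irqs_enabled = true;
}

/*
 * Given that inner is known to nest inside outer, an inversion is possible
 * once outer has been used in hardirq context and inner is irq-unsafe:
 * an IRQ arriving while inner is held could take outer and then spin on inner.
 */
static bool irq_inversion(const struct lock_class *outer,
			  const struct lock_class *inner)
{
	return outer->used_in_hardirq && inner->taken_irqs_enabled;
}

/* Replays the sequence from the report; returns true if it behaves as described. */
static bool fasync_scenario(void)
{
	struct lock_class fasync_lock = { "fasync_lock", false, false };
	struct lock_class f_lock = { "&f->f_lock", false, false };
	bool before, after;

	/* Some path takes f->f_lock with interrupts enabled: irq-unsafe. */
	note_acquire(&f_lock, false, true);

	/* Process context: fasync_lock held, f->f_lock taken inside it. */
	note_acquire(&fasync_lock, false, false);
	note_acquire(&f_lock, false, false);
	before = irq_inversion(&fasync_lock, &f_lock);

	/* Later, fasync_lock is taken from hardirq context. */
	note_acquire(&fasync_lock, true, false);
	after = irq_inversion(&fasync_lock, &f_lock);

	if (after)
		printf("%s just changed state; it has irq-unsafe nestee %s\n",
		       fasync_lock.name, f_lock.name);

	return !before && after;
}
```

In this model nothing is flagged until fasync_lock is first seen in
hardirq context, which matches the "just changed the state of lock"
wording in the report.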

Would something like:

"but this lock had a %s-irq-unsafe nestee in the past:" read better?

> And why
> is f_lock described as "HARDIRQ-READ-irq-unsafe"?  It's a spinlock and
> the "READ" part is not relevant.

I think that's a bug due to the recent irq state tracking generalization
patches, will hunt.

> > Mar  1 00:07:03 localhost kernel: [   86.440299] 
> > Mar  1 00:07:03 localhost kernel: [   86.440300] and interrupts could create inverse lock ordering between them.
> > Mar  1 00:07:03 localhost kernel: [   86.440302] 


