Message-ID: <20091106075820.GA28227@elte.hu>
Date:	Fri, 6 Nov 2009 08:58:20 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Tejun Heo <tj@...nel.org>, Nick Piggin <npiggin@...e.de>
Cc:	Jiri Kosina <jkosina@...e.cz>,
	Peter Zijlstra <peterz@...radead.org>,
	Yinghai Lu <yhlu.kernel@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>, cl@...ux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: Re: irq lock inversion


* Tejun Heo <tj@...nel.org> wrote:

> Ingo Molnar wrote:
> >>> This warning is bogus -- sched_init() is being called very early with IRQs
> >>> disabled, and the irqsave/restore code paths in pcpu_alloc() are only for early
> >>> init. The path can never be called from irq context once the early init
> >>> finishes. Rationale for this is explained in changelog of the commit mentioned
> >>> above.
> >>>
> >>> This problem can be encountered generally in any other early code running
> >>> with IRQs off and using irqsave/irqrestore.
> >>>
> >>> Reported-by: Yinghai Lu <yhlu.kernel@...il.com>
> >>> Signed-off-by: Jiri Kosina <jkosina@...e.cz>
> >> Looks good to me.  Ingo, what do you think?
> > 
> > Ugh, this explanation is _BOGUS_. As i said, taking a lock with irqs 
> > disabled does _NOT_ mark a lock as 'irq safe' - if it did, we'd have 
> > false positives left and right.
> > 
> > Read the lockdep message please, consider all the backtraces it prints, 
> > it says something different.
> 
> Ah... okay, the pcpu_free() path is correctly marking the lock 
> irqsafe.  I assumed this was caused by recent pcpu_alloc() change. 
> Sorry about that.  The lock inversion problem has always been there, 
> it just never showed up because no one has used an allocation map that 
> large, I suppose.
> 
> So, the correct fix would be either 1. push irq-safeness down to the 
> vmalloc locks or 2. the rather ugly unlock-lock dancing in 
> pcpu_extend_area_map() I posted earlier.  For 2.6.32, I guess we'll 
> have to go with #2.  For the longer term, we'll probably have to do #1 
> as it's required to implement atomic percpu allocations too.
> 
> I'll try to reproduce the problem here and verify the previous locking 
> dance patch.
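
The distinction argued over in the quoted exchange above (taking a lock
with irqs disabled does not by itself mark it irq-safe) can be pictured
with a toy model. This is only a sketch, not lockdep's implementation:
the struct, the two flags and both functions are hypothetical names, and
real lockdep tracks much more state (separate hardirq/softirq bits, read
vs. write usage, full dependency chains).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names here are hypothetical, not kernel API. */
struct toy_lock_class {
	bool used_in_irq;       /* ever taken from irq context: "irq-safe"  */
	bool held_with_irqs_on; /* ever taken with irqs enabled: "irq-unsafe" */
};

/* Record one acquisition of the lock in the given context. */
static void toy_acquire(struct toy_lock_class *lc,
			bool in_irq, bool irqs_enabled)
{
	if (in_irq)
		lc->used_in_irq = true;
	if (irqs_enabled)
		lc->held_with_irqs_on = true;
	/*
	 * Process context with irqs disabled sets neither flag:
	 * merely holding the lock with irqs off proves nothing about
	 * whether an interrupt handler ever takes it.
	 */
}

/*
 * Taking 'inner' while holding 'outer' is an irq lock inversion when
 * 'outer' is irq-safe and 'inner' is irq-unsafe: an interrupt arriving
 * while another CPU holds 'inner' with irqs on can then deadlock.
 */
static bool toy_irq_inversion(const struct toy_lock_class *outer,
			      const struct toy_lock_class *inner)
{
	return outer->used_in_irq && inner->held_with_irqs_on;
}
```

In this model an early-boot acquisition with irqs off leaves the outer
lock unmarked, so no inversion is reported; only a later acquisition
from irq context (the role the pcpu_free() path plays in the thread)
makes the outer-to-inner dependency a problem.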

I haven't looked deeply, but at first sight i'm not 100% sure that even 
the lock dance hack is safe - doesn't vfree() do TLB flushes, which must 
be done with irqs enabled in general? If yes, then the whole notion of 
using the allocator from irqs-off sections is wrong and the flags 
save/restore is misguided (or at least incomplete).
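
For reference, the general shape of an unlock-lock dance like the one
discussed above can be sketched in userspace C. This is not the posted
pcpu_extend_area_map() patch: struct toy_map and toy_extend_map() are
hypothetical, a pthread mutex stands in for the spinlock, and
calloc()/free() stand in for the allocation and free calls that must
not run under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an area map; not the kernel structure. */
struct toy_map {
	pthread_mutex_t lock;
	int *entries;
	size_t alloc;		/* capacity, in entries */
};

/*
 * Ensure room for at least 'want' entries.  Called with m->lock held
 * and returns with it held.  Returns 0 on success, -1 if out of memory.
 */
static int toy_extend_map(struct toy_map *m, size_t want)
{
	while (m->alloc < want) {
		size_t new_alloc = m->alloc ? m->alloc * 2 : 16;
		int *new_entries, *stale;

		while (new_alloc < want)
			new_alloc *= 2;

		/* Drop the lock around the allocation, then retake it. */
		pthread_mutex_unlock(&m->lock);
		new_entries = calloc(new_alloc, sizeof(*new_entries));
		pthread_mutex_lock(&m->lock);
		if (!new_entries)
			return -1;

		if (new_alloc <= m->alloc) {
			/* Someone else extended the map meanwhile. */
			stale = new_entries;
		} else {
			if (m->entries)
				memcpy(new_entries, m->entries,
				       m->alloc * sizeof(*new_entries));
			stale = m->entries;
			m->entries = new_entries;
			m->alloc = new_alloc;
		}

		/* Free outside the lock as well, then loop to re-check. */
		pthread_mutex_unlock(&m->lock);
		free(stale);
		pthread_mutex_lock(&m->lock);
	}
	return 0;
}
```

The sketch only moves the allocation and the free outside the lock. It
does nothing about the caller's own context, which is exactly the
concern raised above: if the caller runs with irqs off, the free still
happens with irqs off.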

So the real problem right now, i think, is the use of the pcpu allocator 
from within a BH section (and from irqs-off sections) - that usage 
should be eliminated from .32, or the allocator should be fixed (which 
looks non-trivial: vmalloc/vfree was never really intended to be used in 
irq-atomic contexts).

	Ingo