lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1210219729.3453.97.camel@ymzhang>
Date:	Thu, 08 May 2008 12:08:49 +0800
From:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Andi Kleen <andi@...stfloor.org>, Matthew Wilcox <matthew@....cx>,
	Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Alexander Viro <viro@....linux.org.uk>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: AIM7 40% regression with 2.6.26-rc1


On Wed, 2008-05-07 at 20:29 -0700, Linus Torvalds wrote:
> 
> On Thu, 8 May 2008, Zhang, Yanmin wrote:
> >
> > Congratulations! The patch really fixes the regression completely!
> > vmstat showed cpu idle is 0%, just like 2.6.25's.
> 
> Well, that shows that it was the BKL.
> 
> That said, "idle 0%" is easy when you spin. Do you also have actual 
> performance numbers?
Yes. My conclusion is based on the actual number. cpu idle 0% is just
a behavior it should be.

>  I'd hope that not only do we use full CPU time, it's 
> also at least as fast as the old semaphores were?
Yes.

> 
> While I've been dissing sleeping locks (because their overhead is so 
> high), at least in _theory_ they can get better behavior when not 
> spinning. Now, that's not going to happen with the BKL, I'm 99.99% sure, 
> but I'd still like to hear actual performance numbers too, just to be 
> sure.
For sure.

> 
> Anyway, at least the "was it the BKL or some other semaphore user" 
> question is entirely off the table.
> 
> So we need to
> 
>  - fix the BKL. My patch may be a good starting point, but there are 
>    alternatives:
> 
>     (a) reinstate the old BKL code entirely
> 
> 	Quite frankly, I'd prefer not to. Not only did it have three 
> 	totally different cases, some of them were apparently broken (ie 
> 	BKL+regular preempt didn't cond_resched() right), and I just don't 
> 	think it's worth maintaining three different versions, when 
> 	distro's are going to pick one anyway. *We* should pick one, and 
> 	maintain it.
> 
>     (b) screw the special BKL preemption - it's a spinlock, we don't 
> 	preempt other spinlocks, but at least fix BKL+preempt+cond_resched
> 	thing. 
> 
> 	This would be "my patch + fixes" where at least one of the fixes 
> 	is the known (apparently old) cond_preempt() bug.
> 
>     (c) Try to keep the 2.6.25 code as closely as possible, but just 
> 	switch over to mutexes instead.
> 
> 	I dunno. I was never all that enamoured with the BKL as a sleeping 
> 	lock, so I'm biased against this one, but hey, it's just a 
> 	personal bias.
> 
>  - get rid of the BKL anyway, at least in anything that is noticeable.
> 
> 	Matthew's patch to file locking is probably worth doing as-is, 
> 	simply because I haven't heard any better solutions. The BKL 
> 	certainly can't be it, and whatever comes out of the NFSD 
> 	discussion will almost certainly involve just making sure that 
> 	those leases just use the new fs/locks.c lock.
> 
> 	This is also why I'd actually prefer the simplest possible 
> 	(non-preempting) spinlock BKL. Because it means that we can get 
> 	rid of all that "saved_lock_depth" crud (like my patch already 
> 	did). We shouldn't aim for a clever BKL, we should aim for a BKL 
> 	that nobody uses.
> 
> I'm certainly open to anything. Regardless, we should decide fairly soon, 
> so that we have the choice made before -rc2 is out, and not drag this out, 
> since regardless of the choice it needs to be tested and people comfy with 
> it for the 2.6.26 release.
> 
> 			Linus

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ