linux-kernel - Re: AIM7 40% regression with 2.6.26-rc1

Date:	Wed, 7 May 2008 19:49:00 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Matthew Wilcox <matthew@....cx>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"J. Bruce Fields" <bfields@...i.umich.edu>,
	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Alexander Viro <viro@....linux.org.uk>,
	linux-fsdevel@...r.kernel.org
Subject: Re: AIM7 40% regression with 2.6.26-rc1

* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> > There's far more normal mutex fastpath use during an AIM7 run than 
> > any BKL use. So if it's due to any direct fastpath overhead and the 
> > resulting widening of the window for the real slowdown, we should 
> > see a severe slowdown on AIM7 with CONFIG_MUTEX_DEBUG=y. Agreed?
> 
> Not agreed.
> 
> The BKL is special because it is a *single* lock.

ok, indeed my suggestion is wrong and this would not be a good 
comparison.

another idea: my trial-balloon patch should test your theory too, because 
the generic down_trylock() is still the 'fat' version, it does:

        spin_lock_irqsave(&sem->lock, flags);
        count = sem->count - 1;
        if (likely(count >= 0))
                sem->count = count;
        spin_unlock_irqrestore(&sem->lock, flags);

if there is a noticeable performance difference between your 
trial-balloon patch and mine, then the micro-cost of the BKL very much 
matters to this workload. Agreed about that?
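
to make the 'fat' vs. 'thin' distinction concrete, here is a rough 
user-space sketch. It is only an illustration, not either of the patches 
discussed in this thread: struct fat_sem, struct thin_sem and both 
functions are hypothetical names, a pthread mutex stands in for 
sem->lock, and the atomic variant is just one assumed way a cheaper 
trylock could look. Both return 0 on success and 1 on failure, like the 
kernel's down_trylock():

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for the kernel's struct semaphore. */
struct fat_sem {
	pthread_mutex_t lock;	/* plays the role of sem->lock */
	int count;
};

/*
 * 'Fat' trylock: every attempt takes and releases an inner lock,
 * mirroring the spin_lock_irqsave()/spin_unlock_irqrestore() pair.
 * Returns 0 if the semaphore was acquired, 1 if it was not.
 */
static int fat_down_trylock(struct fat_sem *sem)
{
	int count;

	pthread_mutex_lock(&sem->lock);
	count = sem->count - 1;
	if (count >= 0)
		sem->count = count;
	pthread_mutex_unlock(&sem->lock);

	return count < 0;
}

/* Hypothetical 'thin' variant: the count itself is the only shared state. */
struct thin_sem {
	atomic_int count;
};

/*
 * 'Thin' trylock: a compare-and-swap loop on the count, no inner lock.
 * Returns 0 if the semaphore was acquired, 1 if it was not.
 */
static int thin_down_trylock(struct thin_sem *sem)
{
	int old = atomic_load(&sem->count);

	while (old > 0) {
		/* On failure, 'old' is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&sem->count, &old, old - 1))
			return 0;
	}
	return 1;
}
```

the fat version pays for a lock/unlock pair on every attempt, the thin 
one for a single atomic operation in the uncontended case. That 
per-call difference is the 'micro-cost' in question.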

but i'd be _hugely_ surprised about it. The tty code's BKL use should, i 
think, only happen when a task exits and releases the tty. And even if 
this is a threaded test (which AIM7 can be - not sure which exact 
parameters Yanmin used), the costs of thread creation and thread exit 
are just not in the same ballpark as any BKL micro-costs. Dunno, maybe 
i overlooked some high-freq BKL user (but any such site would have 
shown up before). Even assuming a widening of the critical path and 
some catastrophic domino effect (that does show up as increased 
scheduling), i've never seen a 40% drop like this.

this regression, to me, has "different scheduling behavior" written all 
over it - but that's just an impression. I'm not going to bet against 
you though ;-)

	Ingo