Date:	Thu, 21 Jun 2007 09:30:31 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Jarek Poplawski <jarkao2@...pl>,
	Miklos Szeredi <miklos@...redi.hu>, cebbert@...hat.com,
	chris@...ee.ca, linux-kernel@...r.kernel.org, tglx@...utronix.de,
	akpm@...ux-foundation.org
Subject: Re: [BUG] long freezes on thinkpad t60


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> In other words, spinlocks are optimized for *lack* of contention. If a 
> spinlock has contention, you don't try to make the spinlock "fair". 
> No, you try to fix the contention instead!

yeah, and if there's no easy solution, change it to a mutex. Fastpath 
performance of spinlocks and mutexes is essentially the same, and if 
there's any measurable contention then the scheduler is pretty good at 
sorting things out. If the average contention is longer than 10-20 
microseconds then we could likely already win by scheduling away to some 
other task. (the best is of course to have no contention at all - but 
there are cases where that is really hard, and there are cases where 
it's outright unmaintainable.)

Hw makers are currently producing transistors disproportionately faster 
than humans are producing parallel code; as a result we've got 
more CPU cache than ever, even taking natural application bloat into 
account. (it just makes no sense to spend those transistors on 
parallelism when applications are not making use of it yet. Plus 
caches are a lot less power-intensive than the CPU's functional units, 
and the limit these days is power input.)

So scheduling more frequently and more aggressively makes more sense than 
ever before, and that trend will likely not stop for some time to come. 

> The patch I sent out was an example of that. You *can* fix contention 
> problems. Does it take clever approaches? Yes. It's why we have hashed 
> spinlocks, RCU, and code sequences that are entirely lockless and use 
> optimistic approaches. And suddenly you get fairness *and* 
> performance!

what worries me a bit though is that my patch that made spinlocks 
equally aggressive to that loop didn't solve the hangs! So there is some 
issue we don't understand yet - why was the wait_task_inactive() 
open-coded spin-trylock loop starving the other core, which had ... an 
open-coded spin-trylock loop coded up in assembly? And we've got a 
handful of other open-coded loops in the kernel (networking for example), 
so this issue could come back and haunt us in a situation where we don't 
have a gifted hacker like Miklos able to spend _weeks_ tracking 
down the problem...

	Ingo
