Message-ID: <20070621203013.GA466@elte.hu>
Date:	Thu, 21 Jun 2007 22:30:13 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Eric Dumazet <dada1@...mosbay.com>,
	Chuck Ebbert <cebbert@...hat.com>,
	Jarek Poplawski <jarkao2@...pl>,
	Miklos Szeredi <miklos@...redi.hu>, chris@...ee.ca,
	linux-kernel@...r.kernel.org, tglx@...utronix.de,
	akpm@...ux-foundation.org
Subject: Re: [BUG] long freezes on thinkpad t60


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> >          for (;;) {
> >                  for (i = 0; i < loops; i++) {
> >                          if (__raw_write_trylock(&lock->raw_lock))
> >                                  return;
> >                          __delay(1);
> >                  }
> 
> What a piece of crap. 
> 
> Anybody who ever waits for a lock by busy-looping over it is BUGGY, 
> dammit!
> 
> The only correct way to wait for a lock is:
> 
>   (a) try it *once* with an atomic r-m-w 
>   (b) loop over just _reading_ it (and something that implies a memory 
>       barrier, _not_ "__delay()". Use "cpu_relax()" or "smp_rmb()")
>   (c) rinse and repeat.
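
For readers following along, the three steps quoted above can be sketched in user-space C11 as below. This is only an illustration, not the kernel's implementation: demo_lock_t, demo_trylock(), demo_lock(), demo_unlock() and demo_cpu_relax() are hypothetical names, and demo_cpu_relax() is merely a compiler barrier standing in for the kernel's cpu_relax().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space lock type; not the kernel's raw lock type. */
typedef struct { atomic_int locked; } demo_lock_t;

/* Assumed stand-in for cpu_relax(): here only a compiler barrier. */
static inline void demo_cpu_relax(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/* (a) a single atomic read-modify-write attempt */
static inline bool demo_trylock(demo_lock_t *l)
{
	return atomic_exchange_explicit(&l->locked, 1,
					memory_order_acquire) == 0;
}

static inline void demo_lock(demo_lock_t *l)
{
	for (;;) {
		if (demo_trylock(l))
			return;
		/* (b) wait by only _reading_ the lock word */
		while (atomic_load_explicit(&l->locked,
					    memory_order_relaxed))
			demo_cpu_relax();
		/* (c) the lock looked free: go back and retry the r-m-w */
	}
}

static inline void demo_unlock(demo_lock_t *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The point of step (b) is that waiters spin on a shared, read-only copy of the cache line and only go back to the atomic operation once the lock appears free.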

damn, i first wrote up an explanation about why that ugly __delay(1) is 
there (it almost hurts my eyes when i look at it!) but then deleted it 
as superfluous :-/

really, it's not because i'm stupid (although i might still be stupid 
for other reasons ;-), it wasn't there in earlier spin-debug versions. We 
even had an inner spin_is_locked() loop at one stage (and should add it 
again).

the reason for the __delay(1) was really mundane: to be able to figure 
out when to print a 'we locked up' message to the user. If it's 1 
second, it causes false positives on some systems. If it's 10 minutes, 
people press reset before we print out any useful data. It used to be 
just a loop of rep_nop()s, but that was hard to calibrate: on certain 
newer hardware it was triggering in as little as 2 seconds, causing many 
false positives. We cannot use jiffies nor any other clocksource in this 
debug code.

so i settled for the butt-ugly but working __delay(1) thing, to be able 
to time the debug messages.
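
A rough user-space sketch of that idea, timing the lockup message by counting fixed-length delays instead of reading a clock, might look like the following. It is not the actual spin-debug code: dbg_lock_t, dbg_delay() and dbg_lock_timed() are made-up names, dbg_delay() is an uncalibrated stand-in for __delay(), and the read-before-trylock check is the kind of inner read loop mentioned above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration only. */
typedef struct { atomic_int locked; } dbg_lock_t;

/* Assumed stand-in for __delay(): an uncalibrated busy wait. */
static void dbg_delay(unsigned long n)
{
	for (volatile unsigned long i = 0; i < n * 100; i++)
		;
}

/*
 * Try to take the lock for at most 'loops' iterations, with one fixed
 * delay per iteration. No clocksource is consulted: the timeout is
 * roughly loops * (length of one delay). Returns false on timeout,
 * where a debug build would print its 'we locked up' message.
 */
static bool dbg_lock_timed(dbg_lock_t *l, unsigned long loops)
{
	for (unsigned long i = 0; i < loops; i++) {
		/* read first, attempt the atomic r-m-w only if it looks free */
		if (!atomic_load_explicit(&l->locked, memory_order_relaxed) &&
		    atomic_exchange_explicit(&l->locked, 1,
					     memory_order_acquire) == 0)
			return true;
		dbg_delay(1);
	}
	return false;
}
```

How long 'loops' iterations take depends entirely on how well the delay is calibrated, which is the trade-off described above.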

	Ingo