lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 27 Aug 2010 15:37:33 +0200
From:	Petr Tesarik <ptesarik@...e.cz>
To:	linux-ia64@...r.kernel.org
Cc:	linux-kernel@...r.kernel.org, Tony Luck <tony.luck@...el.com>,
	Hedi Berriche <hedi@....com>
Subject: Serious problem with ticket spinlocks on ia64

Hi everybody,

SGI has recently experienced failures with the new ticket spinlock
implementation. Hedi Berriche sent me a simple test case that can
trigger the failure on the siglock. To debug the issue, I wrote a
small module that watches writes to current->sighand->siglock and
records the values.

I observed that the __ticket_spin_lock() primitive fails when the
tail wraps around to zero. I reconstructed the following:

CPU 7 holds the spinlock
CPU 5 wants to acquire the spinlock
Spinlock value is 0xfffcffff
	 (now serving 0x7fffe, next ticket 0x7ffff)

CPU 7 executes st2.rel to release the spinlock.
At the same time CPU 5 executes a fetchadd4.acq.
The resulting lock value is 0xfffe0000 (correct), and CPU 5 has
recorded its ticket number (0x7fff).

Consequently, the first spinlock loop iteration succeeds, and CPU 5
now holds the spinlock.

Next, CPU 5 releases the spinlock with st2.rel, changing the lock
value to 0x0 (correct).

SO FAR SO GOOD.

Now, CPU 4, CPU 5 and CPU 7 all want to acquire the lock again.
Interestingly, CPU 5 and CPU 7 are both granted the same ticket,
and the spinlock value (as seen from the debug fault handler) is
0x0 after single-stepping over the fetchadd4.acq, in both cases.
CPU 4 correctly sets the spinlock value to 0x1.

I don't know if the simultaneos acquire attempt and release are
necessary to trigger the bug, but I noted it here.

I've only seen this happen when the spinlock wraps around to zero,
but I don't know whether it cannot happen otherwise.

In any case, there seems to be a serious problem with memory
ordering, and I'm not an expert to tell exactly what it is.

Any ideas?

Petr Tesarik
L3 International
Novell, Inc.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ