linux-kernel - Re: Serious problem with ticket spinlocks on ia64


Message-Id: <201009031635.25093.ptesarik@suse.cz>
Date:	Fri, 3 Sep 2010 16:35:23 +0200
From:	Petr Tesarik <ptesarik@...e.cz>
To:	Tony Luck <tony.luck@...el.com>
Cc:	"linux-ia64@...r.kernel.org" <linux-ia64@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Serious problem with ticket spinlocks on ia64

On Friday 03 of September 2010 11:04:37 Petr Tesarik wrote:
> [...]
> I'm now trying to modify the lock primitives:
>
> 1. replace the fetchadd4.acq with looping over cmpxchg

I did this and I feel dumber than ever. Basically, I replaced this snippet:

	ticket = ia64_fetchadd(1, p, acq);

with:

	int	tmp;
	do {
		ticket = ACCESS_ONCE(lock->lock);
		asm volatile (
			"mov ar.ccv=%1\n"
			"add %0=1,%1;;\n"
			"cmpxchg4.acq %0=[%2],%0,ar.ccv\n"
			: "=r" (tmp)
			: "r" (ticket), "r" (&lock->lock)
			: "ar.ccv");
	} while (tmp != ticket);

Just to make sure I didn't miss something, this compiled to:

0xa0000001008dacb0: [MMI]       nop.m 0x0
0xa0000001008dacb1:             ld4.acq r15=[r32]
0xa0000001008dacb2:             nop.i 0x0;;
0xa0000001008dacc0: [MII]       mov.m ar.ccv=r15
0xa0000001008dacc1:             adds r14=1,r15;;
0xa0000001008dacc2:             nop.i 0x0
0xa0000001008dacd0: [MII]       cmpxchg4.acq r14=[r32],r14,ar.ccv
0xa0000001008dacd1:             nop.i 0x0
0xa0000001008dacd2:             nop.i 0x0;;
0xa0000001008dace0: [MIB]       nop.m 0x0
0xa0000001008dace1:             cmp4.eq p7,p6=r14,r15
0xa0000001008dace2:       (p06) br.cond.dptk.few 0xa0000001008dacb0
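
For readers without an ia64 background, the intent of the two variants above can be sketched in portable C11 atomics. This is only an illustration, not the kernel code: `demo_lock_t`, `take_ticket_fetchadd` and `take_ticket_cmpxchg` are hypothetical names, and the sketch ignores how the real ia64 lock word is laid out. Both functions are meant to return the value the lock word held just before the increment.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the lock word; not the kernel's spinlock type. */
typedef struct {
	_Atomic uint32_t lock;
} demo_lock_t;

/* Counterpart of the fetchadd variant: atomically add 1 with acquire
 * semantics and return the value seen before the increment. */
static uint32_t take_ticket_fetchadd(demo_lock_t *l)
{
	return atomic_fetch_add_explicit(&l->lock, 1, memory_order_acquire);
}

/* Counterpart of the cmpxchg loop: read the word, try to install
 * value + 1, and retry until no other CPU changed it in between. */
static uint32_t take_ticket_cmpxchg(demo_lock_t *l)
{
	uint32_t ticket = atomic_load_explicit(&l->lock, memory_order_relaxed);

	/* On failure, 'ticket' is refreshed with the current memory value. */
	while (!atomic_compare_exchange_weak_explicit(&l->lock, &ticket,
						      ticket + 1,
						      memory_order_acquire,
						      memory_order_relaxed))
		;
	return ticket;
}
```

Under these assumptions, a successful exchange always leaves the word exactly one higher than the value the caller read, which is the property the recorded sequence below is being checked against.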

My test module recorded the following sequence on the failing CPU:

  }, {
    ip = 0xa00000010012f7b0,
    addr = 0xe000000181925c08,
    oldvalue = 0xffff0000,
    newvalue = 0x0,
    task = 0xe000000186930000
  }, {
    ip = 0xa0000001008dacd0,
    addr = 0xe000000181925c08,
    oldvalue = 0x0,
    newvalue = 0x0,
    task = 0xe000000186930000
  }, {
    ip = 0xa0000001008dacd0,
    addr = 0xe000000181925c08,
    oldvalue = 0x1,
    newvalue = 0x0,
    task = 0xe000000186930000
  }, {
    ip = 0xa0000001008dacd0,
    addr = 0xe000000181925c08,
    oldvalue = 0x1,
    newvalue = 0x0,
    task = 0xe000000186930000
  }, {
    ip = 0xa0000001008dacd0,
    addr = 0xe000000181925c08,
    oldvalue = 0x0,
    newvalue = 0x0,
    task = 0xe000000186930000
  }, {
    ip = 0xa0000001008dacd0,
    addr = 0xe000000181925c08,
    oldvalue = 0x1,
    newvalue = 0x1,
    task = 0xe000000186930000
  }, {

I didn't see values around zero on any other CPU in the system. So, either 
there is something seriously broken in hardware, or I made a silly mistake in 
the monitoring code.

I'm attaching my SystemTap script. I know it's hacky, but it worked for me.

Oh, I had to make two modifications to the running kernel:

1. in ia64_fault()
By default the value of cr.ifa is not passed to the die notifiers, so I 
(mis)used the ar_ssd field to store the ifa before calling notify_die() for 
the debug faults.

2. in ivt.S
On all interrupt entries I added code similar to this (just using different 
registers if appropriate):

	movl r3 = (1 << 24)
	mov r15 = psr
	;;
	or  r3 = r3,r15
	;;
	mov psr.l = r3
	;;
	srlz.d
	;;
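
The bit arithmetic in that fragment can be written out in C as a small sketch. It assumes that bit 24 of the processor status register is the debug-breakpoint enable bit (psr.db), which matches the "debug faults" mentioned above but is not stated explicitly in this message. The names `DEMO_PSR_DB_BIT`, `DEMO_PSR_DB` and `psr_with_db` are hypothetical.

```c
#include <stdint.h>

/* Assumption: bit 24 of the PSR is psr.db (debug breakpoint enable). */
#define DEMO_PSR_DB_BIT	24
#define DEMO_PSR_DB	(UINT64_C(1) << DEMO_PSR_DB_BIT)

/* Mirrors "movl r3 = (1 << 24)" followed by "or r3 = r3,r15":
 * returns the given PSR value with bit 24 forced on and all
 * other bits left unchanged. */
static uint64_t psr_with_db(uint64_t psr)
{
	return psr | DEMO_PSR_DB;
}
```

The sketch only computes the new value; writing it back (`mov psr.l`) and serializing (`srlz.d`) have no portable C counterpart.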

Am I blind, or did I do something obviously wrong?

Petr Tesarik

View attachment "watchlock.stp" of type "text/x-csrc" (7511 bytes)