Date:	Fri, 28 Aug 2009 11:52:58 +0900
From:	Tejun Heo <tj@...nel.org>
To:	Steven Rostedt <rostedt@...dmis.org>
CC:	LKML <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [BUG] lockup with the latest kernel

Tejun Heo wrote:
>>> Always happens where one CPU is sending an IPI and the other has the rq 
>>> spinlock. Seems to be that the IPI expects the other CPU to not have 
>>> interrupts disabled or something?
> 
> I'm not too familiar with apics but AFAIK sending IPI isn't an
> interlocked operation (at least not at the software level) so I doubt
> it has much to do with the other cpu doing or not doing anything.  It
> looks like the local apic is stuck hardware-wise.  The only thing the
> commit changes is that cpu1 would be using vector 0xf1 instead of 0xf0
> together with cpu0.
> 
> (reading the doc...) Okay, here's something interesting.  It's from
> section 9.8.4 of intel doc 253668.pdf - Intel 64 and IA-32
> Architectures Software Developer's Manual Volume 3A: System
> Programming Guide, Part 1.
> 
>  For the P6 family and Pentium processors, the IRR and ISR registers
>  can queue no more than two interrupts per priority level, and will
>  reject other interrupts that are received within the same priority
>  level.
> 
> And from AMD's 24593 - AMD64 Architecture Programmer's Manual Volume
> 2: System Programming, section 16.6.3.
> 
>  No more than two interrupts can be pending for the same interrupt
>  vector number. Subsequent interrupt requests to the same interrupt
>  vector number will be rejected. See Figure 16-23 on page 445.
> 

Oh... there are differences that I missed.

 All Intels: If more than one interrupt is generated with the same
	     vector number, the local APIC can set the bit for the
	     vector both in the IRR and the ISR. This means that for
	     the Pentium 4 and Intel Xeon processors, the IRR and ISR
	     can queue two interrupts for each interrupt vector: one
	     in the IRR and one in the ISR. Any additional interrupts
	     issued FOR THE SAME INTERRUPT VECTOR are COLLAPSED INTO
	     THE SINGLE BIT in the IRR.

 P6/Pentium: no more than two interrupts PER PRIORITY LEVEL, and will
             REJECT OTHER interrupts

 AMD64: Subsequent interrupt requests to THE SAME INTERRUPT VECTOR
        NUMBER will be REJECTED.
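
To make the difference concrete, here is a toy C model of the three
descriptions as I read them. It is only a sketch, not kernel code and not
a statement of what the hardware really does: the names (lapic_model,
lapic_deliver, lapic_start_service, the POLICY_* values) are made up for
illustration, the "priority level" is assumed to be the upper four bits of
the vector, and the per-priority limit is assumed to count IRR and ISR
bits together.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_VECTORS 256

/* Hypothetical policies, one per quoted description. */
enum apic_policy {
	POLICY_COLLAPSE,	/* extra requests fold into the one IRR bit */
	POLICY_REJECT_VECTOR,	/* at most one IRR + one ISR per vector */
	POLICY_REJECT_PRIORITY,	/* at most two pending per priority level */
};

struct lapic_model {
	enum apic_policy policy;
	bool irr[NR_VECTORS];	/* requested, not yet being serviced */
	bool isr[NR_VECTORS];	/* being serviced, no EOI yet */
};

static void lapic_init(struct lapic_model *a, enum apic_policy policy)
{
	memset(a, 0, sizeof(*a));
	a->policy = policy;
}

/* Count IRR and ISR bits set in the 16-vector class containing vec. */
static int pending_in_class(const struct lapic_model *a, unsigned int vec)
{
	unsigned int base = vec & ~0xfu;
	int n = 0;

	for (unsigned int v = base; v < base + 16; v++)
		n += a->irr[v] + a->isr[v];
	return n;
}

/* Returns true if the request is accepted (or collapsed), false if rejected. */
static bool lapic_deliver(struct lapic_model *a, unsigned int vec)
{
	assert(vec < NR_VECTORS);

	switch (a->policy) {
	case POLICY_COLLAPSE:
		a->irr[vec] = true;
		return true;
	case POLICY_REJECT_VECTOR:
		if (a->irr[vec])
			return false;
		a->irr[vec] = true;
		return true;
	case POLICY_REJECT_PRIORITY:
		if (pending_in_class(a, vec) >= 2)
			return false;
		a->irr[vec] = true;
		return true;
	}
	return false;
}

/* The CPU starts handling vec: the bit moves from the IRR to the ISR. */
static void lapic_start_service(struct lapic_model *a, unsigned int vec)
{
	assert(vec < NR_VECTORS && a->irr[vec]);
	a->irr[vec] = false;
	a->isr[vec] = true;
}
```

In this model, with one 0xf0 in service and one more 0xf0 requested, a
third 0xf0 is accepted under POLICY_COLLAPSE but rejected under
POLICY_REJECT_VECTOR. Under POLICY_REJECT_PRIORITY, 0xf0 and 0xf1 share a
level, so once both are pending a request for 0xf2 is rejected while one
for 0xe0 still goes through.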

Eh... I don't have the earlier AMD docs and have to go now.  Can somebody
please check?  But it looks like we can deadlock by simply sending
RESCHEDULE_VECTOR more than two times while holding rq lock on AMD?

Thanks.

-- 
tejun