Date:	Wed, 13 Jan 2010 13:43:45 -0500
From:	Michael Breuer <mbreuer@...jas.com>
To:	paulmck@...ux.vnet.ibm.com
Cc:	linux-kernel@...r.kernel.org
Subject: 2.6.33rc4 RCU hang mm spin_lock deadlock(?) after running libvirtd -
 reproducible.

[Originally posted as: "Re: 2.6.33RC3 libvirtd ->sky2 & rcu oops (was
Sky2 oops - Driver tries to sync DMA memory it has not allocated)"]

On 1/11/2010 8:49 PM, Paul E. McKenney wrote:
> On Sun, Jan 10, 2010 at 03:10:03PM -0500, Michael Breuer wrote:
>    
>> On 1/9/2010 5:21 PM, Michael Breuer wrote:
>>      
>>> Hi,
>>>
>>> Attempting to move back to mainline after my recent 2.6.32 issues...
>>> Config is make oldconfig from working 2.6.32 config. Patch for af_packet.c
>>> (for skb issue found in 2.6.32) included. Attaching .config and NMI
>>> backtraces.
>>>
>>> System becomes unusable after bringing up the network:
>>>
>>> ...
> RCU stall warnings are usually due to an infinite loop somewhere in the
> kernel.  If you are running !CONFIG_PREEMPT, then any infinite loop not
> containing some call to schedule will get you a stall warning.  If you
> are running CONFIG_PREEMPT, then the infinite loop is in some section of
> code with preemption disabled (or irqs disabled).
>
> The stall-warning dump will normally finger one or more of the CPUs.
> Since you are getting repeated warnings, look at the stacks and see
> which of the most-recently-called functions stays the same in successive
> stack traces.  This information should help you finger the infinite (or
> longer than average) loop.
> ...
>    
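One way to do the comparison suggested above is sketched below in Python. It is only a rough sketch and assumes the stall dumps are plain text in the usual x86 format: each dump starts with "Call Trace:" and frames look like " [<address>] function+0xoffset/0xsize", with a leading "?" on frames the unwinder considers unreliable. The names split_traces and persistent_functions are made up for illustration and are not part of any kernel tool.

```python
import re

# Assumed frame format: " [<ffffffff8103c6e4>] ? some_function+0x1a/0x200"
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def split_traces(log_text):
    """Return one list of function names (innermost first) per 'Call Trace:' block."""
    traces = []
    current = None
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            current = []
            traces.append(current)
            continue
        if current is None:
            continue  # text before the first trace
        match = FRAME.search(line)
        if match is None:
            continue  # not a stack frame line
        unreliable, name = match.groups()
        if not unreliable:  # skip frames marked with "?"
            current.append(name)
    # Drop blocks in which no reliable frame was recognised.
    return [trace for trace in traces if trace]


def persistent_functions(traces, depth=5):
    """Return the functions found among the `depth` innermost frames of every trace."""
    counts = {}
    for trace in traces:
        for name in set(trace[:depth]):
            counts[name] = counts.get(name, 0) + 1
    return sorted(name for name, n in counts.items() if n == len(traces))
```

To use it, concatenate the saved dumps (for example the stall1 and stall2 attachments) into one string and call persistent_functions(split_traces(text)). Functions that survive for a small depth are the candidates for the loop or lock being spun on.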
I can now reproduce this simply by running "service libvirtd start" on an
F12 box. My earlier report, which suggested this had something to do with
the sky2 driver, was incorrect. Interestingly, it's always CPU1 whenever I
start libvirtd.
Attaching two of the traces (I've got about ten, but they're all pretty
much the same). They look pretty consistent: libvirtd on CPU1 is hung
while forking. I'm not sure why yet; perhaps someone who knows this better
than I do can jump in.
In summary, the hang appears to occur when libvirtd forks: two threads
showing the same pid are deadlocked on a spin_lock.
> Then if looking at the stack traces doesn't locate the offending loop,
> bisection might help.
>    
It would; however, it's going to be really difficult, as I wasn't able to
get this far with rc1 & rc2 :(
> 							Thanx, Paul
>
>    


Attachment: "stall1", text/plain (34802 bytes)
Attachment: "stall2", text/plain (35630 bytes)
