Date:	Wed, 03 Jan 2007 17:02:03 -0800
From:	Ben Greear <greearb@...delatech.com>
To:	Herbert Xu <herbert@...dor.apana.org.au>
CC:	David Stevens <dlstevens@...ibm.com>, jarkao2@...pl,
	netdev@...r.kernel.org
Subject: Re: BUG: soft lockup detected on CPU#0!  (2.6.18.2 plus hacks)

Herbert Xu wrote:
> David Stevens <dlstevens@...ibm.com> wrote:
>> Ben,
>>        Here's a patch that I think will fix it, assuming the receive is 
>> on the
>> same device as the initialization. Can you try this out?
> 
> Hi David:
> 
> Your patch makes sense on its own but I don't see the direct connection
> to the soft lock-up.  Sure it prevents the code path in question from
> triggering.  However, if we don't understand why it's locking up in the
> first place then this may just be hiding it rather than fixing it.
> 
> In particular, a soft lockup means that we're doing so much work in
> the softirq handlers that processes are not getting run.  So what is
> it exactly here that's causing us to get stuck in the softirq handlers?
> Is it because we're somehow getting stuck in a net rx loop?

I'm not sure if it helps, but I did notice that 'ip' was using 99% of the
CPU on the system.  Could this be because it was spinning trying to acquire
the read-lock?  When I ran 'ifconfig -a', that process hung, and at that point
the system was rebooted.  Before I ran ifconfig, 'top', 'ls', and similar
apps were responding fine, and I was logged in over ssh from the US to
Australia, so its basic networking was functioning.

What if the race is that the read-lock is only half initialized, so that
it doesn't trigger the uninitialized-lock-use debug message, but is still
left in a state that never lets a reader acquire the lock?
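
To make that hypothesis concrete, here is a toy user-space model of it.  This
is only a sketch, not the kernel's rwlock code: the toy_* names are made up,
it is single-threaded and non-atomic, and it assumes a bias-counter style lock
where a counter of zero looks the same as "held by a writer".  It shows how a
lock whose debug marker has been written, but whose counter has not yet been
set to the unlocked value, could pass a debug check and still never admit a
reader.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for this toy model only. */
#define TOY_BIAS  0x01000000   /* counter value of an unlocked lock */
#define TOY_MAGIC 0xdeaf1eedu  /* marker the toy debug check looks for */

struct toy_rwlock {
    int32_t  count;  /* TOY_BIAS when free; each reader subtracts 1 */
    uint32_t magic;  /* set by initialization; inspected by the debug check */
};

/* Toy stand-in for an "uninitialized lock" debug test: it only inspects
 * the marker, so it cannot tell whether the counter was initialized too. */
bool toy_debug_check(const struct toy_rwlock *l)
{
    return l->magic == TOY_MAGIC;
}

/* One read-lock attempt: take a reader slot, back out if the counter
 * went negative (which is what a writer-held lock looks like). */
bool toy_read_trylock(struct toy_rwlock *l)
{
    l->count -= 1;
    if (l->count >= 0)
        return true;
    l->count += 1;  /* undo and report failure */
    return false;
}

/* Spin up to max_spins times.  Returns the number of failed attempts
 * before success, or -1 if the lock was never acquired. */
int toy_read_lock_bounded(struct toy_rwlock *l, int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (toy_read_trylock(l))
            return i;
    }
    return -1;
}
```

With a lock set up as { .count = 0, .magic = TOY_MAGIC } (the "half
initialized" case), toy_debug_check() passes while toy_read_lock_bounded()
gives up with -1 no matter how many spins it is allowed.  With
{ .count = TOY_BIAS, .magic = TOY_MAGIC } the first attempt succeeds.  An
unbounded spin in the first case would look a lot like a process stuck at
99% CPU.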

Thanks,
Ben

> 
> Cheers,


-- 
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc  http://www.candelatech.com

