Message-ID: <6278d2220908170653s45df9989t9fa550f7efa0c182@mail.gmail.com>
Date:	Mon, 17 Aug 2009 14:53:08 +0100
From:	Daniel J Blueman <daniel.blueman@...il.com>
To:	Trond Myklebust <Trond.Myklebust@...app.com>
Cc:	linux-nfs@...r.kernel.org, Chuck Lever <chuck.lever@...cle.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [2.6.31-rc5] oops: NFS4 client manager kthread...

Hi Trond,

On Mon, Aug 17, 2009 at 2:12 PM, Trond Myklebust
<Trond.Myklebust@...app.com> wrote:
> On Sun, 2009-08-16 at 23:40 +0100, Daniel J Blueman wrote:
>> After losing and regaining ethernet link a few times with 2.6.31-rc5
>> [1], I've hit an oops in the NFS4 client manager kthread [2] on my
>> client with NFS4 homedir mount.
>>
>> Do you have a frequent test-case for when the client's manager kthread
>> gets invoked (with and without succeeding callbacks, due to eg a
>> firewall)? Server here is unpatched 2.6.30-rc6; I recall seeing
>> problems when the manager kthread gets invoked, across quite a few
>> kernel releases, just wasn't lucky enough to catch an oops.
>>
>> Oopsing in allow_signal() suggests task state corruption perhaps? I'm
>> downloading the debug kernel to match up the disassembly and line
>> numbers, if that helps? This time, the client had no firewall (but
>> have seen other issues when the callback has failed due to the
>> firewall).
>
> Those aren't Oopses. They are 'soft lockup' warnings. Basically, they're
> saying that the CPU is getting stuck waiting for a spin lock or a mutex.
>
> In this case, it is probably the fact that the state manager is going
> nuts trying to recover, while the connection to the server keeps coming
> up and going down.
>
> What does 'netstat -t' say when you get into this situation?

Whoops; it's true the stack-trace comes from the soft-lockup detector.

There was a single 200s link excursion, but the client didn't recover
afterwards; it seems locks are held and never released. I observe the
'192.168.1.250-m' NFS4 manager kthread being created and not going
away, even though IP connectivity with the server is fine afterwards.

I'll reproduce it with stock 2.6.31-rc6 on the client and get 'netstat
-t' output.
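
To keep that output readable, something like the following could trim it
down to the connections to the NFS server. This is only a rough sketch: the
helper name nfs_connections is made up for illustration, it assumes the
numeric form 'netstat -tn' (so addresses appear as ip:port rather than
host:service), and it assumes the server is reached on the standard NFS
TCP port 2049, which I have not verified for this setup.

```python
def nfs_connections(netstat_output, server_ip, port=2049):
    """Return (local, remote, state) tuples for TCP connections to server_ip:port.

    Hypothetical helper: expects the text printed by 'netstat -tn', whose data
    rows have the columns
    Proto Recv-Q Send-Q Local-Address Foreign-Address State.
    """
    target = f"{server_ip}:{port}"
    rows = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip the banner and header lines; keep only tcp/tcp6 rows whose
        # foreign address is the NFS server.
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[4] == target:
            rows.append((fields[3], fields[4], fields[5]))
    return rows
```

It could be fed from subprocess.run(["netstat", "-tn"], capture_output=True,
text=True).stdout with server_ip set to "192.168.1.250"; a connection that
keeps cycling between states, or none at all, would be the interesting case.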

Thanks for looking at this!
  Daniel

> Cheers
>  Trond
>
> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> Trond.Myklebust@...app.com
> www.netapp.com
>



-- 
Daniel J Blueman