Date: Tue, 03 Mar 2009 08:36:21 +0100
From: Carsten Aulbert <carsten.aulbert@....mpg.de>
To: linux-kernel@...r.kernel.org
CC: linux-nfs@...r.kernel.org
Subject: Re: kernel BUG at kernel/workqueue.c:291
Hi Andrew,
Andrew Morton wrote:
>> in the meantime 43 of our nodes were struck with this error. It seems
>> that the jobs of a certain user can trigger this bug, however I have no
>> clue how to really trigger it manually.
>
> That's a lot of nodes.
Quite, at least some percentage of the whole system.
>
> Let's cc the NFS developers, see if this rpciod crash is familiar to them?
Good idea, I should have done that myself, sorry.
I think we were able to pinpoint at least one user whose jobs "generate"
this, but I need to talk to him about what NFS access patterns they use.
Systems are running Debian Etch:
dpkg -l | awk '/(nfs|portmap)/ {print $2 "\t\t" $3}'
libnfsidmap2         0.18-0
mountnfs             1.1.3-2
nfs-common           1.0.10-6+etch.1
nfs-kernel-server    1.0.10-6+etch.1
portmap              5-26
If you need more, please let me know! So far the machines are 'on hold',
i.e. we have not yet rebooted them, so that we can still find out a little
bit more. If you (or anyone else) think we can reboot them and put them back
into our scheduling queue, please let me know; the users are waiting for
more cycles.
Thanks a lot
Carsten