Date:	Mon, 18 Feb 2008 15:25:09 -0600
From:	Tom Tucker <tom@...ngridcomputing.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Michael Tokarev <mjt@....msk.ru>,
	Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-nfs@...r.kernel.org
Subject: Re: 2.6.24: RPC: bad TCP reclen 0x00020090 (large)


On Mon, 2008-02-18 at 04:58 -0800, Andrew Morton wrote:
> (suitable cc added)
> 
> (regression)
> 
> On Wed, 13 Feb 2008 17:02:53 +0300 Michael Tokarev <mjt@....msk.ru> wrote:
> 
> > Hello!
> > 
> > After upgrading to 2.6.24 (from .23), we're seeing a lot
> > of messages like the one in the subject in dmesg:
> > 
> > Feb 13 13:21:39 paltus kernel: RPC: bad TCP reclen 0x00020090 (large)
> > Feb 13 13:21:46 paltus kernel: printk: 3586 messages suppressed.
> > Feb 13 13:21:46 paltus kernel: RPC: bad TCP reclen 0x00020090 (large)
> > Feb 13 13:21:49 paltus kernel: printk: 371 messages suppressed.
> > Feb 13 13:21:49 paltus kernel: RPC: bad TCP reclen 0x00020090 (large)
> > Feb 13 13:21:55 paltus kernel: printk: 2979 messages suppressed.
> > ...
> > 
> > on a Linux NFS server.  The clients are all Linux too, mostly 2.6.23
> > and some 2.6.22.
> > 
> > I found the "offending" piece of code in net/sunrpc/svcsock.c,
> > in routine svc_tcp_recvfrom() with condition being:
> > 
> >    if (svsk->sk_reclen > serv->sv_max_mesg) ...

The problem might be that the client is setting a bit in the RPC message
length field that is meant to be interpreted and masked off by the
server -- and we're not doing it yet. My bet is that 0x20000 is the bit
we're looking for. I'll poke around...
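
For reference, here is a minimal sketch of how a record marker is usually
split up, assuming the standard ONC RPC record marking layout (top bit is
the last-fragment flag, low 31 bits are the fragment length). The names
decode_marker(), exceeds_limit() and max_mesg are made up for illustration
and are not the svcsock.c code:

```c
#include <stdint.h>

/* Assumed layout (standard RPC record marking over TCP):
 * bit 31 = last-fragment flag, bits 0..30 = fragment length. */
#define RM_LAST_FRAGMENT 0x80000000u
#define RM_LENGTH_MASK   0x7fffffffu

/* Hypothetical helper type, not a kernel structure. */
struct record_marker {
	int      last_fragment;	/* non-zero if the flag bit is set */
	uint32_t length;	/* fragment length with the flag masked off */
};

/* Split a record marker (already in host byte order) into its parts. */
static struct record_marker decode_marker(uint32_t raw)
{
	struct record_marker rm;

	rm.last_fragment = (raw & RM_LAST_FRAGMENT) != 0;
	rm.length = raw & RM_LENGTH_MASK;
	return rm;
}

/* Mirror of the quoted condition: is the masked length over the limit?
 * max_mesg stands in for serv->sv_max_mesg. */
static int exceeds_limit(uint32_t raw, uint32_t max_mesg)
{
	return decode_marker(raw).length > max_mesg;
}
```

Under that assumption, 0x00020090 decodes to a length of 131216 bytes
(0x20000 + 0x90, i.e. 128 KiB plus 144) with the flag bit clear, so the
0x20000 bit would count as part of the length unless something else
defines it as a flag.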

> > 
> > This happens after a server reboot.  At this point, the client(s) try
> > to perform some NFS transaction and fail, and the server starts generating
> > the above messages until I do a umount followed by a mount on all clients.
> > Previously, such situations (NFS server reboot) were handled transparently,
> > i.e., there was nothing to do: the mount continued working just fine once
> > the server came back online.
> > 
> > Now, I'm not sure if it's really a 2.6.24-specific problem or a userspace
> > problem.  Some time ago we also upgraded the nfs-kernel-server (Debian)
> > package, and the remount-after-nfs-server-reboot problem started to
> > occur at THAT time (and it is something to worry about as well, I just
> > had no time to deal with it); but the dmesg spamming only appeared
> > with 2.6.24.
> > 
> > How can I debug the issue further from this point?
> > 
