linux-kernel - Re: [2.6.26-rc4] mount.nfsv4/memory poisoning issues...

Message-ID: <20080619081420.24645bc4@tleilax.poochiereds.net>
Date:	Thu, 19 Jun 2008 08:14:20 -0400
From:	Jeff Layton <jlayton@...hat.com>
To:	"Daniel J Blueman" <daniel.blueman@...il.com>
Cc:	chucklever@...il.com, linux-nfs@...r.kernel.org,
	nfsv4@...ux-nfs.org, "Linux Kernel" <linux-kernel@...r.kernel.org>,
	"J. Bruce Fields" <bfields@...ldses.org>,
	"Trond Myklebust" <trond.myklebust@....uio.no>
Subject: Re: [2.6.26-rc4] mount.nfsv4/memory poisoning issues...

On Sun, 15 Jun 2008 19:10:27 +0100
"Daniel J Blueman" <daniel.blueman@...il.com> wrote:

> On Thu, Jun 5, 2008 at 12:43 AM, Chuck Lever <chuck.lever@...cle.com> wrote:
> > Hi Daniel-
> >
> > On Wed, Jun 4, 2008 at 7:33 PM, Daniel J Blueman
> > <daniel.blueman@...il.com> wrote:
> >> Having experienced 'mount.nfs4: internal error' when mounting nfsv4 in
> >> the past, I have a minimal test-case I sometimes run:
> >>
> >> $ while :; do mount -t nfs4 filer:/store /store; umount /store; done
> >>
> >> After ~100 iterations, I saw the 'mount.nfs4: internal error',
> >> followed by symptoms of memory corruption [1], a locking issue with
> >> the reporting [2] and another (related?) memory-corruption issue
> >> (off-by-1?) [3]. A little analysis shows memory being overwritten by
> >> (likely) a poison value, which gets complicated if it's not
> >> use-after-free...
> >>
> >> Anyone dare confirm this issue? NFSv4 server is x86-64 Ubuntu 8.04
> >> 2.6.24-18, client U8.04 2.6.26-rc4; batteries included [4].
> >
> > We have some other reports of late model kernels with memory
> > corruption issues during NFS mount.  The problem is that by the time
> > these canaries start singing, the evidence of what did the corrupting
> > is long gone.
> >
> >> I'm happy to decode addresses, test patches etc.
> >
> > If these crashes are more or less reliably reproduced, it would be
> > helpful if you could do a 'git bisect' on the client to figure out at
> > what point in the kernel revision history this problem was introduced.
> >
> > Have you seen the problem on client kernels earlier than 2.6.25?
> 
> Firstly, I had omitted that I'd booted the kernel with
> debug_objects=1, which provides the canary here.
> 
> The primary failure I see is 'mount.nfs4: internal error', always
> after 358 umount/mount cycles (plus 1 initial mount), which gives us a
> clue: 'netstat' shows all of these connections in the TIME_WAIT state,
> so the bug appears to be in the error path taken when a socket cannot
> be allocated. I found that once the connection lifetime expired, I
> could mount again, which corroborates this theory.
> 
> In this case, the mount() syscall results in the mount.nfs4 process
> being SEGV'd when booted with 'debug_objects=1'; without this
> option, we see:
> 
> # strace /sbin/mount.nfs4 x1:/ /store
> ...
> mount("x1:/", "/store", "nfs4", 0,
> "addr=192.168.0.250,clientaddr=19"...) = -1 EIO (Input/output error)
> 
> So, it's impossible to tell when the corruption was introduced, as it
> has only become detectable recently.
> 
> The socket-allocation error path is worth a look-over, if someone
> can check. The problem reproduces 100% of the time with the
> 'debug_objects=1' param (available since 2.6.26-rc1) and 359 mounts
> in quick succession.
> 
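
The reproduction described above (cycle mount/umount until the mount fails, then look at how many sockets are stuck in TIME_WAIT) could be scripted roughly as follows. This is only a sketch: the helper names cycle_until_failure and count_time_wait are hypothetical and do not come from the thread, and the TIME_WAIT count relies on the /proc/net/tcp layout, where the fourth column holds the TCP state in hex and 06 means TIME_WAIT.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not from the thread.

# Run a command repeatedly until it fails or LIMIT runs have succeeded,
# then print the number of successful runs.
cycle_until_failure() {
    limit=$1
    shift
    i=0
    while [ "$i" -lt "$limit" ]; do
        "$@" || break
        i=$((i + 1))
    done
    echo "$i"
}

# Count TIME_WAIT sockets in /proc/net/tcp-formatted input (files given as
# arguments, or stdin). Column 4 ("st") is the TCP state in hex; 06 is
# TIME_WAIT. The first line is a header and is skipped.
count_time_wait() {
    awk 'NR > 1 && $4 == "06" { n++ } END { print n + 0 }' "$@"
}
```

Run as root on the client, using the server and mount point from the original report, this might be invoked as:

    cycle_until_failure 400 sh -c 'mount -t nfs4 filer:/store /store && umount /store'
    count_time_wait /proc/net/tcp

One possible reading of the figure 359, which the thread itself does not state: the sunrpc client normally binds to a reserved source port, and if the defaults for sunrpc.min_resvport and sunrpc.max_resvport are 665 and 1023, that range holds exactly 1023 - 665 + 1 = 359 ports. Treat this as an assumption to check against /proc/sys/sunrpc/ on the affected client.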

For some strange reason (probably something I'm doing wrong or maybe
something environmental), I've not been able to reproduce this panic on
a stock kernel. I did, however, apply the following fault injection
patch and was able to reproduce it on the second mount attempt. The
3-patch set that I posted last week definitely prevents the oops. If
you're able to confirm that it also fixes your panic, it would be a
helpful data point.

The fault injection patch I'm using is attached. It just simulates
nfs4_init_client() consistently returning an error.

Cheers,
-- 
Jeff Layton <jlayton@...hat.com>

View attachment "nfs4-mount-fault-injection.patch" of type "text/x-patch" (402 bytes)