Message-Id: <0A24B45A-9761-4310-B1DB-B4738964E862@oracle.com>
Date:	Fri, 5 Sep 2008 15:56:46 -0400
From:	Chuck Lever <chuck.lever@...cle.com>
To:	Aaron Straus <aaron@...finllc.com>
Cc:	Neil Brown <neilb@...e.de>,
	Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
	Trond Myklebust <trond.myklebust@....uio.no>,
	LKML Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [NFS] blocks of zeros (NULLs) in NFS files in kernels >= 2.6.20

[ replacing cc: nfs@...net with linux-nfs@...r.kernel.org, and neil's  
old address with his current one ]

On Sep 5, 2008, at 3:19 PM, Aaron Straus wrote:
> Hi all,
>
>  We're hitting some bad behavior in NFS v3.  The situation is this:
>
>   machine A - NFS server
>
>   machine B - NFS client (writer)
>   machine C - NFS client (reader)
>
>   (all machines x86 SMP)
>
>  machine A exports a directory on ext3 filesystem:
>
> 	/srv/home       192.168.0.0/24(rw,sync,no_subtree_check)
>
>  machines B and C mount that directory normally
>
>        mount A:/srv/home /mntpnt
>
>  machine B opens a file and writes to it (think a log file)
>
>  machine C stats that file, opens it and reads it (think tailing the
>                                                    log file)
>
>
>  The issue is that machine C will often see large blocks of NULLs
> (zeros) in the file.  If you do the same read again just after you see
> the block of NULLs you will see the proper data.
>
>  Attached are two simple python programs that demonstrate the problem.
>
>  To use them (they will write to a file called test-nfs in CWD):
>
> (on machine B in one window)
>
>   python writer.py
>
> (on machine C in another window)
>
>   python reader.py
>
>
>  reader.py will die when it sees NULLs in the file.  Usually for us
> this happens after about 60s (two timeouts I think).   The first NULL
> is usually either at index 4000 or 8000 depending on the kernel.
>
>
>  Now the version of the kernel the server is running doesn't seem to
> matter.  The reader also doesn't seem to matter (though I didn't test
> this completely).  The writer seems to be the issue:
>
>  Writer_Version     Outcome:
>  <= 2.6.19          OK
>  >= 2.6.20          BAD

Up to which kernel?  Recent ones may address this issue already.

>  I've tested both vanilla kernel.org kernels and Ubuntu 8.04 kernels.
>
>  I can try to bisect between 2.6.19 <-> 2.6.20.

That's a good start.

Comparing a wire trace with strace output, starting with the writing  
client, might also be illuminating.  We prefer wireshark as it uses  
good default trace settings, parses the wire bytes and displays them  
coherently, and allows you to sort the frames in various useful ways.
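
To line up what the reader sees with the WRITE and READ offsets in a
trace, it may help to record exactly where each block of NULLs starts
and how long it is.  Below is a minimal sketch of such a check.  It is
not the attached writer.py or reader.py; the function names
(append_record, find_nul_runs, check_file) and the 100-byte record
layout are assumptions made up for illustration.

```python
def append_record(path, seq, width=100):
    """Append one fixed-width ASCII record; records never contain NUL bytes.

    The record layout (10-digit sequence number, padding, newline) is a
    hypothetical choice, not the format used by the original writer.py.
    """
    line = f"{seq:010d} ".encode("ascii")
    line += b"x" * (width - len(line) - 1) + b"\n"
    with open(path, "ab") as f:
        f.write(line)
        f.flush()
    return len(line)


def find_nul_runs(data):
    """Return a list of (start_offset, length) for each maximal run of NULs."""
    runs = []
    i = 0
    n = len(data)
    while i < n:
        if data[i] == 0:
            start = i
            # Advance to the end of this run of zero bytes.
            while i < n and data[i] == 0:
                i += 1
            runs.append((start, i - start))
        else:
            i += 1
    return runs


def check_file(path):
    """Read the whole file and report any NUL runs found in it."""
    with open(path, "rb") as f:
        return find_nul_runs(f.read())
```

Calling check_file("test-nfs") repeatedly on the reading client and
logging any non-empty result with a timestamp would give byte ranges
that can be compared against the offsets and counts shown in the
trace.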

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
