Message-ID: <ufapnkdw3s3.fsf@epithumia.math.uh.edu>
Date:   Fri, 06 Sep 2019 15:47:24 -0500
From:   Jason L Tibbitts III <tibbs@...h.uh.edu>
To:     "J. Bruce Fields" <bfields@...ldses.org>
Cc:     Wolfgang Walter <linux@...m.de>, linux-nfs@...r.kernel.org,
        km@...all.com, linux-kernel@...r.kernel.org
Subject: Re: Regression in 5.1.20: Reading long directory fails

>>>>> "JBF" == J Bruce Fields <bfields@...ldses.org> writes:

JBF> Those readdir changes were client-side, right?  Based on that I'd
JBF> been assuming a client bug, but maybe it'd be worth getting a full
JBF> packet capture of the readdir reply to make sure it's legit.

I have been working with bcodding on IRC for the past couple of days on
this.  Fortunately I was able to come up with a way to fill up a directory
in such a way that it will fail with certainty and, as a bonus, doesn't
include any user data, so I can feel OK about sharing packet captures.  I
have a capture alongside a kernel trace of the problematic operation in
https://www.math.uh.edu/~tibbs/nfs/.  Not that I can particularly tell
anything useful from that, but bcodding says that it seems to point to
some issue in sunrpc.

And because I can easily reproduce this, I was able to do a bisect:

2c94b8eca1a26cd46010d6e73a23da5f2e93a19d is the first bad commit
commit 2c94b8eca1a26cd46010d6e73a23da5f2e93a19d
Author: Chuck Lever <chuck.lever@...cle.com>
Date:   Mon Feb 11 11:25:41 2019 -0500

    SUNRPC: Use au_rslack when computing reply buffer size

    au_rslack is significantly smaller than (au_cslack << 2). Using
    that value results in smaller receive buffers. In some cases this
    eliminates an extra segment in Reply chunks (RPC/RDMA).

    Signed-off-by: Chuck Lever <chuck.lever@...cle.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@...app.com>

:040000 040000 d4d1ce2fbe0035c5bd9df976b8c448df85dcb505 7011a792dfe72ff9cd70d66e45d353f3d7817e3e M      net

But of course, I can't say whether this is the actual bad commit or
whether it just introduced a behavior change which alters the conditions
under which the problem appears.

And just to make sure that the blame doesn't lie with the old RHEL7
kernel, I rsynced over the problematic directory to a machine running
something slightly more modern (5.1.11, which I know I need to update,
but it's already set up to do kerberised NFS) and the same problem
exists, though the directory listing does fail at a different place.

 - J<
