Message-ID: <ufa5zm9s7kz.fsf@epithumia.math.uh.edu>
Date: Tue, 03 Sep 2019 10:49:48 -0500
From: Jason L Tibbitts III <tibbs@...h.uh.edu>
To: bfields@...ldses.org (J. Bruce Fields)
Cc: linux-nfs@...r.kernel.org, km@...all.com,
linux-kernel@...r.kernel.org
Subject: Re: Regression in 5.1.20: Reading long directory fails

>>>>> "JLT" == Jason L Tibbitts <tibbs@...h.uh.edu> writes:

JLT> Certainly a server reboot, or maybe even just
JLT> unmounting and remounting the filesystem or copying the data to
JLT> another filesystem would tell me that. In any case, as soon as I
JLT> am able to mess with that server, I'll know more.

Rebooting the server did not make any difference, and now more users are
seeing the problem. At this point NFS simply isn't reliable for us, and
I'm not sure what to do. If CentOS 8 were out, I'd work on moving to
that just so that the server was a little more modern. (Currently the
server is CentOS 7.) I guess I could try using Fedora, or installing one
of the upstream kernels, in case this has to do with some interaction
between the client and the old RHEL7 kernel.

I do have a packet capture of a directory listing that fails with EIO,
but I'm not sure whether it's safe to post it publicly, and I'm not sure
which tshark options would be useful in decoding it.

I do know that I can rsync one of the problematic directories to a
different server (running the same kernel) and it doesn't have the same
problem. What I'll try next is rsyncing to a different filesystem on
the same server, but again I'll have to wait until people log off to do
proper testing.
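
One way to compare the original directory with an rsynced copy is to
read each listing to the end and record any error that comes back. What
follows is only a minimal sketch in Python, assuming the failure reaches
user space as an OSError carrying EIO. The function name check_listing
and the paths given on the command line are illustrative and do not come
from the report.

```python
import errno
import os
import sys


def check_listing(path):
    """Read every entry in `path`; return (entries_read, error_name or None)."""
    count = 0
    try:
        with os.scandir(path) as entries:
            for _ in entries:
                count += 1
    except OSError as exc:
        # The report describes listings failing with EIO; record whichever
        # errno is actually raised so that other failures are visible too.
        return count, errno.errorcode.get(exc.errno, str(exc.errno))
    return count, None


if __name__ == "__main__":
    # Each command-line argument is a directory to test (hypothetical paths).
    for directory in sys.argv[1:]:
        count, err = check_listing(directory)
        status = "ok" if err is None else "failed with " + err
        print(f"{directory}: read {count} entries, {status}")
```

Running it from an affected client against both the original directory
and the copy would show whether only one of them returns an error.
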
- J<