Message-ID: <0F10A59FDFFDFD4E9BEBD7365DE6725501EC3707@uk-email.terastack.bluearc.com>
Date:	Wed, 2 Jul 2008 12:03:55 +0100
From:	"Andy Chittenden" <andyc@...earc.com>
To:	<linux-kernel@...r.kernel.org>
Subject: nfs client readdir caching issue?

Very rarely, we're seeing various problems on Linux kernel clients
(seen on several kernel versions) when running ls on directories from
an NFS server that haven't changed:

* looping ls (strace -v shows getdents returning the same names over
again).
* duplicate directory entries.
* missing directory entries.

I've searched Google but can only find reports where NFS servers have
returned duplicate cookies. I've packet captured the readdirplus on one
of the directories and see no duplicate cookies. The problems remain
until the directory is touched, the NFS filesystem is unmounted, or some
other event happens (the data is flushed from the cache?).

I think we then got lucky and got two packet captures from different
clients running the same Linux kernel. On these clients, the ls output
was OK: no loops, no duplicates, no missing entries. Both captures
showed two readdirplus requests returning the same entries in the same
order, but the amount of data in each response was different. In one
capture, the server returned 1724 bytes, 10 entries, with a last cookie
of 12, followed by a second readdirplus response of 948 bytes, 5
entries, with a first cookie of 13. In the other capture, the responses
were 2204 bytes, 13 entries, with a last cookie of 17, and 468 bytes, 2
entries, with a first cookie of 19.

In the past we've found that ls has returned duplicate entries on this
directory (but we didn't have a capture at the time). Those duplicate
entries are the ones returned as the last 3 entries in the first
response of the second capture and as the first 3 entries in the
second response of the first capture.
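
To make the overlap concrete, here is a rough sketch that models only
the entry counts from the two captures (10 + 5 and 13 + 2). Entries are
numbered by position 1..15, which is my own labelling and not the
server's cookie values; split_pages is a made-up helper.

```python
# Illustrative only: entries are identified by position (1..15), not by
# the real NFS cookies, which are not contiguous in the captures.

def split_pages(total_entries, first_page_count):
    """Split a listing into two responses at the given boundary."""
    entries = list(range(1, total_entries + 1))
    return entries[:first_page_count], entries[first_page_count:]

# First capture: 10 entries, then 5. Second capture: 13 entries, then 2.
capture_one = split_pages(15, 10)
capture_two = split_pages(15, 13)

# Entries present in both the first response of the second capture and
# the second response of the first capture.
overlap = [e for e in capture_two[0] if e in capture_one[1]]
print(overlap)
```

The overlapping positions are the last 3 of the 13-entry response and
the first 3 of the 5-entry response, which is where the duplicates
were seen.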

So what I think has happened in this particular case is that, at some
point in the past, the directory was read OK with packets similar to the
first capture. Next, the client discarded the first page of cached
readdir responses from memory for some reason (running low on memory?)
but kept the second page. Subsequently, the readdir cache needed
repopulating, so the client sent a readdirplus with a cookie of 0, and
this time it got a response similar to the first packet of the second
capture. The cache would then hold duplicate names and cookie values.
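
A toy model of that sequence, in case it helps. This is only a sketch
of the hypothesis: the dict-of-pages cache, the names, and server_reply
are all invented for illustration and are not how the kernel's readdir
cache is actually implemented.

```python
from collections import Counter

# Hypothetical directory of 15 entries; the names are placeholders.
ENTRIES = [f"name{i:02d}" for i in range(1, 16)]

def server_reply(start, count):
    """Pretend readdirplus reply: 'count' entries from position 'start'."""
    return ENTRIES[start:start + count]

# 1. Directory is first read with a 10 + 5 split (like the first capture).
cache = {0: server_reply(0, 10), 1: server_reply(10, 5)}

# 2. The first page is dropped (memory pressure?), the second is kept.
del cache[0]

# 3. The first page is refilled from cookie 0, but this time the server
#    packs 13 entries into the reply (like the second capture).
cache[0] = server_reply(0, 13)

# 4. Walking the cached pages in order now repeats three names.
listing = cache[0] + cache[1]
duplicates = sorted(n for n, c in Counter(listing).items() if c > 1)
print(len(listing), duplicates)
```

In this model the walk yields 18 names for a 15-entry directory, with
the 11th to 13th names appearing twice.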

So is this possible? Is there some easy way to provoke it? Does this
mean the client's readdir cache is broken?

Please cc me on any response.

-- 
Andy, BlueArc Engineering

