Message-ID: <CAOQ4uxgoDHLnVb9=R2LpNqEFtjx=f5K8QXQnfiziBQ+jURLh=A@mail.gmail.com>
Date: Sat, 21 Dec 2019 10:43:05 +0200
From: Amir Goldstein <amir73il@...il.com>
To: "Darrick J. Wong" <darrick.wong@...cle.com>
Cc: Chris Down <chris@...isdown.name>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Al Viro <viro@...iv.linux.org.uk>,
Jeff Layton <jlayton@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Tejun Heo <tj@...nel.org>,
linux-kernel <linux-kernel@...r.kernel.org>, kernel-team@...com
Subject: Re: [PATCH] fs: inode: Reduce volatile inode wraparound risk when
ino_t is 64 bit
On Fri, Dec 20, 2019 at 11:33 PM Darrick J. Wong
<darrick.wong@...cle.com> wrote:
>
> On Fri, Dec 20, 2019 at 02:49:36AM +0000, Chris Down wrote:
> > In Facebook production we are seeing heavy inode number wraparounds on
> > tmpfs. On affected tiers, in excess of 10% of hosts show multiple files
> > with different content and the same inode number, with some servers even
> > having as many as 150 duplicated inode numbers with differing file
> > content.
> >
> > This causes actual, tangible problems in production. For example, we
> > have complaints from those working on remote caches that their
> > application is reporting cache corruptions because it uses (device,
> > inodenum) to establish the identity of a particular cache object, but
>
> ...but you cannot delete the (dev, inum) tuple from the cache index when
> you remove a cache object??
>
> > because it's not unique any more, the application refuses to continue
> > and reports cache corruption. Even worse, sometimes applications may not
> > even detect the corruption but may continue anyway, causing phantom and
> > hard to debug behaviour.
> >
> > In general, userspace applications expect that (device, inodenum) should
> > be enough to uniquely point to one inode, which seems fair enough.
>
> Except that it's not. (dev, inum, generation) uniquely points to an
> instance of an inode from creation to the last unlink.
>
Yes, but also:
There should never be two live inodes on the system with the same (dev, inum).
The problem is that ino 1 may still be alive when the wraparound happens,
and then two different inodes with ino 1 exist on the same dev.
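To make the wraparound case concrete, here is a minimal sketch of a wrapping
counter that never checks for live users of a number. It is a toy model, not
kernel code: ToyInoAllocator, duplicated_inos and the 3-bit counter width are
all made up for illustration.

```python
class ToyInoAllocator:
    """Hypothetical model of a wrapping per-filesystem inode counter.

    Not kernel code; `bits` is shrunk so the wrap is easy to see.
    """

    def __init__(self, bits=32):
        self.mask = (1 << bits) - 1
        self.last = 0
        self.live = {}  # name -> ino for inodes that still exist

    def create(self, name):
        # Hand out the next number, wrapping at 2**bits, without checking
        # whether that number is still held by a live inode.
        self.last = (self.last + 1) & self.mask
        self.live[name] = self.last
        return self.last

    def unlink(self, name):
        del self.live[name]


def duplicated_inos(alloc):
    """Return the inode numbers held by more than one live inode."""
    seen, dups = set(), set()
    for ino in alloc.live.values():
        if ino in seen:
            dups.add(ino)
        seen.add(ino)
    return dups


alloc = ToyInoAllocator(bits=3)   # counter wraps after 8 allocations
alloc.create("long-lived")        # gets ino 1 and stays alive
for i in range(7):                # short-lived files get inos 2..7, then 0
    alloc.create(f"tmp{i}")
    alloc.unlink(f"tmp{i}")
alloc.create("newcomer")          # counter has wrapped: ino 1 again
print(duplicated_inos(alloc))     # prints {1}
```

Only the long-lived file and the newcomer are alive at the end, yet both hold
ino 1, which is the situation described above.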
Take the 'diff' utility, for example: it will report that those files are
identical if they have the same dev, ino, size and mtime. I suspect that 'mv'
will not let you move one over the other, since it will assume they are
hardlinks.
Generation is not even exposed to legacy applications using stat(2).
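As a rough illustration of the kind of check such tools rely on, the sketch
below compares (st_dev, st_ino) from stat(2) for a genuine hardlink and for an
unrelated file. The helper name same_identity and the file names are made up
for this example; it does not reproduce what diff or mv actually do
internally.

```python
import os
import tempfile


def same_identity(path_a, path_b):
    """Compare two paths the way a (dev, ino)-based tool would."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


with tempfile.TemporaryDirectory() as d:
    first = os.path.join(d, "first")
    link = os.path.join(d, "link")
    other = os.path.join(d, "other")
    with open(first, "w") as f:
        f.write("A")
    os.link(first, link)   # genuine hardlink: same inode by design
    with open(other, "w") as f:
        f.write("B")       # separate live inode on the same dev

    hardlink_matches = same_identity(first, link)
    distinct_matches = same_identity(first, other)
    # On Linux, the struct stat filled in by stat(2) has no generation
    # member, so a check like this cannot tell a reused inode number
    # apart from a real hardlink.
    print(hardlink_matches, distinct_matches)
```

On a filesystem that keeps live inode numbers unique, the hardlink pair
matches and the unrelated pair does not. If the number of "first" had been
handed out again to "other" after a wraparound, both comparisons would match.
Python's os.path.samefile() performs the same device and inode comparison.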
Thanks,
Amir.