Message-ID: <20191221180530.GJ7497@magnolia>
Date: Sat, 21 Dec 2019 10:05:30 -0800
From: "Darrick J. Wong" <darrick.wong@...cle.com>
To: Amir Goldstein <amir73il@...il.com>
Cc: Chris Down <chris@...isdown.name>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Al Viro <viro@...iv.linux.org.uk>,
Jeff Layton <jlayton@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Tejun Heo <tj@...nel.org>,
linux-kernel <linux-kernel@...r.kernel.org>, kernel-team@...com
Subject: Re: [PATCH] fs: inode: Reduce volatile inode wraparound risk when
ino_t is 64 bit

On Sat, Dec 21, 2019 at 10:43:05AM +0200, Amir Goldstein wrote:
> On Fri, Dec 20, 2019 at 11:33 PM Darrick J. Wong
> <darrick.wong@...cle.com> wrote:
> >
> > On Fri, Dec 20, 2019 at 02:49:36AM +0000, Chris Down wrote:
> > > In Facebook production we are seeing heavy inode number wraparounds on
> > > tmpfs. On affected tiers, in excess of 10% of hosts show multiple files
> > > with different content and the same inode number, with some servers even
> > > having as many as 150 duplicated inode numbers with differing file
> > > content.
> > >
> > > This causes actual, tangible problems in production. For example, we
> > > have complaints from those working on remote caches that their
> > > application is reporting cache corruptions because it uses (device,
> > > inodenum) to establish the identity of a particular cache object, but
> >
> > ...but you cannot delete the (dev, inum) tuple from the cache index when
> > you remove a cache object??
> >
> > > because it's not unique any more, the application refuses to continue
> > > and reports cache corruption. Even worse, sometimes applications may not
> > > even detect the corruption but may continue anyway, causing phantom and
> > > hard-to-debug behaviour.
> > >
> > > In general, userspace applications expect that (device, inodenum) should
> > > be enough to uniquely point to one inode, which seems fair enough.
> >
> > Except that it's not. (dev, inum, generation) uniquely points to an
> > instance of an inode from creation to the last unlink.
> >
>
> Yes, but also:
> There should not exist two live inodes on the system with the same (dev, inum).
> The problem is that ino 1 may still be alive when wraparound happens,
> and then two different inodes with ino 1 exist on the same dev.

*OH* that's different then. Most sane filesystems <cough>btrfs<cough>
should never have the same inode numbers for different files. Sorry for
the noise, I misunderstood what the issue was. :)
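
To make the wraparound case concrete, here is a toy sketch of a
fixed-width inode counter. It is not kernel code: the class and function
names are invented for illustration, the counter is shrunk to 8 bits so
a full cycle is quick, and unlike a real allocator it hands out 0 like
any other number. One file is created and kept alive while short-lived
files churn through the rest of the number space; the next file created
after the wrap then gets the same number as the still-live one.

```python
class WrappingInoAllocator:
    """Toy model (hypothetical) of an inode counter that wraps at 2**bits."""

    def __init__(self, bits=32):
        self.mask = (1 << bits) - 1
        self.next_ino = 1
        self.live = {}  # ino -> names of live files currently holding it

    def create(self, name):
        # Hand out the next number without checking whether it is in use.
        ino = self.next_ino
        self.next_ino = (self.next_ino + 1) & self.mask
        self.live.setdefault(ino, []).append(name)
        return ino

    def unlink(self, name, ino):
        self.live[ino].remove(name)
        if not self.live[ino]:
            del self.live[ino]

    def duplicates(self):
        # Inode numbers held by more than one live file.
        return {ino: names for ino, names in self.live.items() if len(names) > 1}


def demo(bits=8):
    alloc = WrappingInoAllocator(bits=bits)
    long_lived = alloc.create("long-lived")  # gets ino 1, never unlinked
    # Create and remove enough short-lived files to use up every other number.
    for i in range((1 << bits) - 1):
        ino = alloc.create(f"tmp{i}")
        alloc.unlink(f"tmp{i}", ino)
    newcomer = alloc.create("newcomer")  # counter has wrapped back to 1
    return long_lived, newcomer, alloc.duplicates()
```

Calling demo() returns the two inode numbers and the duplicate map, so
the collision between the long-lived file and the newcomer is visible
directly in the result.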

> Take the 'diff' utility, for example: it will report that those files are
> identical if they have the same dev, ino, size and mtime. I suspect that
> 'mv' will not let you move one over the other, assuming they are hardlinks.
> The generation is not even exposed to legacy applications using stat(2).

Yeah, I was surprised to see it's not even in statx. :/
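
For reference, the userspace identity check being discussed looks
roughly like the sketch below. It is only an illustration: the helper
names are made up, and it is not the actual logic of diff or mv. It
compares (st_dev, st_ino) from stat, which is all a portable
application has to go on, since no generation number is part of what
stat(2) returns.

```python
import os
import tempfile


def file_identity(path):
    """Return the (device, inode number) pair that stat reports for path."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def looks_like_same_file(a, b):
    # True when both paths report the same (dev, ino). This is correct for
    # hardlinks, and wrong if two distinct live files share an inode number.
    return file_identity(a) == file_identity(b)


def demo():
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original")
        hardlink = os.path.join(d, "hardlink")
        other = os.path.join(d, "other")
        with open(original, "w") as f:
            f.write("A")
        with open(other, "w") as f:
            f.write("B")
        os.link(original, hardlink)  # second name for the same inode
        return (looks_like_same_file(original, hardlink),
                looks_like_same_file(original, other))
```

Python's os.path.samefile() performs the same device and inode
comparison, so any code built on it inherits the same assumption that
(dev, ino) is unique among live files.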

--D

> Thanks,
> Amir.