Message-Id: <E1H01Og-0007TF-00@dorka.pomaz.szeredi.hu>
Date: Thu, 28 Dec 2006 20:58:38 +0100
From: Miklos Szeredi <miklos@...redi.hu>
To: bhalevy@...asas.com
CC: arjan@...radead.org, mikulas@...ax.karlin.mff.cuni.cz,
jaharkes@...cmu.edu, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, nfsv4@...f.org
Subject: Re: Finding hardlinks
> >> It seems like the posix idea of unique <st_dev, st_ino> doesn't
> >> hold water for modern file systems
> >
> > are you really sure?
>
> Well Jan's example was of Coda that uses 128-bit internal file ids.
>
> > and if so, why don't we fix *THAT* instead
>
> Hmm, sometimes you can't fix the world, especially if the filesystem
> is exported over NFS and has a problem with fitting its file IDs uniquely
> into a 64-bit identifier.
Note, it's pretty easy to fit _anything_ into a 64-bit identifier with
the use of a good hash function. The chance of an accidental
collision is infinitesimally small. For a set of

          100 files: 0.00000000000003%
    1,000,000 files: 0.000003%
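
As a rough check of the figures above, here is a minimal sketch using the
standard birthday-bound approximation. It assumes the hash output is
uniformly distributed over 64 bits; the function name is made up for
illustration.

```python
import math

def collision_probability(n_files: int, bits: int = 64) -> float:
    """Approximate chance that at least two of n_files hashed IDs collide.

    Birthday bound: P ~= 1 - exp(-n(n-1)/2 / 2^bits), assuming a
    uniformly distributed hash of the given width.
    """
    pairs = n_files * (n_files - 1) / 2
    # expm1 keeps precision when the exponent is tiny.
    return -math.expm1(-pairs / 2.0 ** bits)

for n in (100, 1_000_000):
    print(f"{n:>9,} files: {collision_probability(n) * 100:.1e}%")
```

The results come out at about 3e-14% and 3e-6% respectively, in line with
the numbers quoted above.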
And usually applications (tar, diff, cp -a, etc.) work with a very limited
set of st_ino's. An app that would store a million st_ino values and compare
each new one to all the existing ones would have severe performance
problems and yet _almost never_ come across a false positive.
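
To illustrate the hashing idea from the start of this message, here is a
hypothetical sketch that folds a 128-bit internal file ID into a 64-bit
value. The choice of BLAKE2b and the name ino64_from_file_id are
assumptions made for the example, not what any particular filesystem does.

```python
import hashlib

def ino64_from_file_id(file_id: bytes) -> int:
    """Map an arbitrary-length internal file ID to a 64-bit integer.

    BLAKE2b with an 8-byte digest is just one reasonable choice of a
    well-mixing hash (an assumption for this sketch).
    """
    digest = hashlib.blake2b(file_id, digest_size=8).digest()
    return int.from_bytes(digest, "little")

# Two distinct 128-bit IDs (made-up values) and a repeat of the first.
id_a = (1).to_bytes(16, "big")
id_b = (2).to_bytes(16, "big")

ino_a = ino64_from_file_id(id_a)
ino_b = ino64_from_file_id(id_b)
ino_a_again = ino64_from_file_id(id_a)

print(hex(ino_a), hex(ino_b), ino_a == ino_a_again)
```

The same ID always maps to the same 64-bit value, so hardlinks to one file
still compare equal, while distinct files collide only with the small
probability discussed above.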
Miklos