linux-kernel - Re: Finding hardlinks

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20061221185850.GA16807@delft.aura.cs.cmu.edu>
Date:	Thu, 21 Dec 2006 13:58:50 -0500
From:	Jan Harkes <jaharkes@...cmu.edu>
To:	Miklos Szeredi <miklos@...redi.hu>
Cc:	mikulas@...ax.karlin.mff.cuni.cz, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org
Subject: Re: Finding hardlinks

On Wed, Dec 20, 2006 at 12:44:42PM +0100, Miklos Szeredi wrote:
> The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend
> the kstat.ino field to 64bit and fix those filesystems to fill in
> kstat correctly.

Coda actually uses 128-bit file identifiers internally, so 64-bits
really doesn't cut it. Since the 128-bit space is used pretty sparsely
there is a hash which avoids most collistions in 32-bit i_ino space, but
not completely. I can also imagine that at some point someone wants to
implement a git-based filesystem where it would be more natural to use
160-bit SHA1 hashes as unique object identifiers.

But Coda only allow hardlinks within a single directory and if someone
renames a hardlinked file and one of the names ends up in a different
directory we implicitly create a copy of the object. This actually
leverages off of the way we handle volume snapshots and the fact that we
use whole file caching and writes, so we only copy the metadata while
the data is 'copy-on-write'.

I'm considering changing the way we handle hardlinks by having link(2)
always create a new object with copy-on-write semantics (i.e. replacing
link with some sort of a copyfile operation). This way we can get rid of
several special cases like the cross-directory rename. It also avoids
problems when the various replicas of an object are found to be
inconsistent and we allow the user to expand the file. On expansion a
file becomes a directory that contains all the objects on individual
replicas. Handling the expansion in a dcache friendly way is nasty
enough as is and complicated by the fact that we really don't want such
an expansion to result in hard-linked directories, so we are forced to
inventing new unique object identifiers, etc. Again, not having
hardlinks would simplify things somewhat here.

Any application that tries to be smart enough to keep track of which
files are hardlinked should (in my opinion) also have a way to disable
this behaviour.

Jan

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/