linux-kernel - Re: [PATCH 17/39] union-mount: Union mounts documentation


Message-ID: <20100504211209.GC4360@shareable.org>
Date:	Tue, 4 May 2010 22:12:09 +0100
From:	Jamie Lokier <jamie@...reable.org>
To:	Valerie Aurora <vaurora@...hat.com>
Cc:	Alexander Viro <viro@...iv.linux.org.uk>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	Christoph Hellwig <hch@...radead.org>,
	Jan Blunck <jblunck@...e.de>
Subject: Re: [PATCH 17/39] union-mount: Union mounts documentation

Valerie Aurora wrote:
> +File copyup: Create a file on the top layer that has the same metadata
> +and contents as the file with the same pathname on the bottom layer.

Can copyup be interrupted?  E.g. if I chmod an 80GB file, will the
chmod() system call pause for a couple of hours, or can I control-C it?

> +This deviation from standard is due to technical limitations of the
> +union mount implementation.  Specifically, we would need to replace an
> +open file descriptor from the lower layer with an open file descriptor
> +for a file with matching pathname and contents on the upper layer,
> +which is difficult to do.  We avoid this in other system calls by
> +doing the copyup before the file is opened.  Unionfs doesn't encounter
> +this problem because it creates a dummy file struct which redirects or
> +fans out operations to the struct files for the underlying file
> +systems.
> +
> +From an application's point of view, the result of an in-kernel file
> +copyup is the logical equivalent of another application updating the
> +file via the rename() pattern: creat() a new file, copy the data over,
> +make changes to the copy, and rename() over the old version.  Any
> +existing open file descriptors for that file (including those in the
> +same application) refer to a now invisible object that used to have
> +the same pathname.  Only opens that occur after the copyup will see
> +updates to the file.

Does it apply the same permission checks that a program doing
copy+rename would have to pass?  I guess that is just write access to
the directory.

Does it effectively "rename" all hard links referring to the file, so
that they point to the new version, or does it only affect the path
that was used by the writer/modifier, leaving the other links
referring to the original file?
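
To make the hard link question concrete, here is a minimal sketch of
what a userspace copy + rename() does on an ordinary filesystem.  It
says nothing about what union mounts actually do; the helper name
update_via_rename and the ".tmp" suffix are made up for illustration.

```python
import os
import tempfile


def update_via_rename(path, new_data):
    """Userspace copy + rename(): write a new file, then rename it over path."""
    tmp = path + ".tmp"  # hypothetical temporary name
    with open(tmp, "wb") as f:
        f.write(new_data)
    os.rename(tmp, path)  # atomically replaces the directory entry for path


with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "a")
    b = os.path.join(d, "b")
    with open(a, "wb") as f:
        f.write(b"old")
    os.link(a, b)  # a and b now name the same inode

    update_via_rename(a, b"new")

    with open(a, "rb") as f:
        data_a = f.read()
    with open(b, "rb") as f:
        data_b = f.read()
    same_inode = os.stat(a).st_ino == os.stat(b).st_ino
    links_b = os.stat(b).st_nlink

print(data_a, data_b, same_inode, links_b)  # b'new' b'old' False 1
```

With plain copy + rename(), only the path that was renamed over sees
the new contents.  The other link keeps the original inode, whose
link count drops from 2 to 1.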

> + - File copyup on open(O_DIRECT)

Why is O_DIRECT relevant?  O_DIRECT doesn't imply writing, and
copy+rename behaviour is the same with O_DIRECT as not.

Some programs use O_DIRECT to read very large files that they never
intend to modify.  For example, qemu uses O_DIRECT to access a disk
image backing file.
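
The point that O_DIRECT does not imply writing can be checked from
the open(2) flags alone.  The following is a small sketch; the helper
name wants_write is hypothetical, and the fallback to 0 is only there
for platforms where Python's os module has no O_DIRECT.

```python
import os

# O_DIRECT is Linux-specific; fall back to 0 where the os module lacks it.
O_DIRECT = getattr(os, "O_DIRECT", 0)


def wants_write(flags):
    """Return True if the open(2) access mode in flags permits writing."""
    return (flags & os.O_ACCMODE) in (os.O_WRONLY, os.O_RDWR)


direct_read = wants_write(os.O_RDONLY | O_DIRECT)
direct_rdwr = wants_write(os.O_RDWR | O_DIRECT)
plain_read = wants_write(os.O_RDONLY)

print(direct_read, direct_rdwr, plain_read)  # False True False
```

The access mode is carried by O_RDONLY/O_WRONLY/O_RDWR; adding
O_DIRECT leaves it unchanged.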

> +NFS interaction
> +===============
> +
> +NFS is currently not supported as either type of layer.  NFS as
> +read-only layer requires support from the server to honor the
> +read-only guarantee needed for the bottom layer.  To do this, the
> +server needs to revoke access to clients requesting read-only file
> +systems if the exported file system is remounted read-write or
> +unmounted (during which arbitrary changes can occur).  Some recent
> +discussion:
> +
> +http://markmail.org/message/3mkgnvo4pswxd7lp
> +
> +NFS as the read-write layer would require implementation of the
> +->whiteout() and ->fallthru() methods.  DT_WHT directory entries are
> +theoretically already supported.
> +
> +Also, technically the requirement for a readdir() cookie that is
> +stable across reboots comes only from file systems exported via NFSv2:
> +
> +http://oss.oracle.com/pipermail/btrfs-devel/2008-January/000463.html
> +
> +Todo:
> +
> +- Guarantee really really read-only on NFS exports
> +- Implement whiteout()/fallthru() for NFS

I'm finding it hard to imagine _guaranteeing_ really read-only.  All
you can guarantee is that the NFS server says it is read-only.

For example, a userspace NFS server cannot prevent the filesystem it's
serving from changing.

Is this not a problem with other network filesystems like CIFS, 9P and FUSE?

> +Known non-POSIX behaviors
> +-------------------------
> +
> +- Link count may be wrong for files on bottom layer with > 1 link count

Can you say a bit more about what will be seen?

> +- File copyup is the logical equivalent of an update via copy +
> +  rename().  Any existing open file descriptors will continue to refer
> +  to the read-only copy on the bottom layer and will not see any
> +  changes that occur after the copy-up.

I can imagine some database-like programs getting confused by that.

Maybe it would be better to fail copyup operations when the file is
currently open O_RDONLY by anyone, analogous to the way a writable
mount of a filesystem is refused while any union holds that
filesystem read-only?

Are there uses likely to be broken by that behaviour?
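
For reference, a program that keeps a file open can notice a
copy + rename() style replacement by comparing fstat() on its
descriptor with stat() on the pathname.  The sketch below shows this
on an ordinary filesystem.  The helper name fd_is_stale and the file
names are hypothetical, and whether the same check would detect a
union mount copyup depends on what st_dev/st_ino the union reports,
which the documentation above does not say.

```python
import os
import tempfile


def fd_is_stale(fd, path):
    """Return True if path no longer names the file that fd has open."""
    held = os.fstat(fd)
    try:
        current = os.stat(path)
    except FileNotFoundError:
        return True
    return (held.st_dev, held.st_ino) != (current.st_dev, current.st_ino)


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "db")
    with open(path, "wb") as f:
        f.write(b"v1")

    fd = os.open(path, os.O_RDONLY)  # long-lived read-only descriptor
    stale_before = fd_is_stale(fd, path)

    # Someone else updates the file with copy + rename().
    tmp = path + ".tmp"  # hypothetical temporary name
    with open(tmp, "wb") as f:
        f.write(b"v2")
    os.rename(tmp, path)

    stale_after = fd_is_stale(fd, path)
    seen_by_old_fd = os.pread(fd, 16, 0)  # still reads the old contents
    os.close(fd)

print(stale_before, stale_after, seen_by_old_fd)  # False True b'v1'
```

A program that never performs such a check keeps reading the old
contents indefinitely, which is the confusion I have in mind.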

Thanks,
-- Jamie