lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20190527172655.9287-1-amir73il@gmail.com>
Date:   Mon, 27 May 2019 20:26:55 +0300
From:   Amir Goldstein <amir73il@...il.com>
To:     Theodore Tso <tytso@....edu>, Jan Kara <jack@...e.cz>
Cc:     "Darrick J . Wong" <darrick.wong@...cle.com>,
        Dave Chinner <david@...morbit.com>, Chris Mason <clm@...com>,
        Al Viro <viro@...iv.linux.org.uk>,
        linux-fsdevel@...r.kernel.org, linux-xfs@...r.kernel.org,
        linux-ext4@...r.kernel.org, linux-btrfs@...r.kernel.org,
        linux-api@...r.kernel.org
Subject: [RFC][PATCH] link.2: AT_ATOMIC_DATA and AT_ATOMIC_METADATA

New link flags to request "atomic" link.

Signed-off-by: Amir Goldstein <amir73il@...il.com>
---

Hi Guys,

Following our discussions on LSF/MM and beyond [1][2], here is
an RFC documentation patch.

Ted, I know we discussed limiting the API for linking an O_TMPFILE
to avert the hardlinks issue, but I decided it would be better to
document the hardlinks non-guaranty instead. This will allow me to
replicate the same semantics and documentation to renameat(2).
Let me know how that works out for you.

I also decided to try out two separate flags for data and metadata.
I do not find any of those flags very useful without the other, but
documenting them seprately was easier, because of the fsync/fdatasync
reference.  In the end, we are trying to solve a social engineering
problem, so this is the least confusing way I could think of to describe
the new API.

First implementation of AT_ATOMIC_METADATA is expected to be
noop for xfs/ext4 and probably fsync for btrfs.

First implementation of AT_ATOMIC_DATA is expected to be
filemap_write_and_wait() for xfs/ext4 and probably fdatasync for btrfs.

Thoughts?

Amir.

[1] https://lwn.net/Articles/789038/
[2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxjZm6E2TmCv8JOyQr7f-2VB0uFRy7XEp8HBHQmMdQg+6w@mail.gmail.com/

 man2/link.2 | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/man2/link.2 b/man2/link.2
index 649ba00c7..15c24703e 100644
--- a/man2/link.2
+++ b/man2/link.2
@@ -184,6 +184,57 @@ See
 .BR openat (2)
 for an explanation of the need for
 .BR linkat ().
+.TP
+.BR AT_ATOMIC_METADATA " (since Linux 5.x)"
+By default, a link operation followed by a system crash, may result in the
+new file name being linked with old inode metadata, such as out dated time
+stamps or missing extended attributes.
+One way to prevent this is to call
+.BR fsync (2)
+before linking the inode, but that involves flushing of volatile disk caches.
+
+A filesystem that accepts this flag will guaranty, that old inode metadata
+will not be exposed in the new linked name.
+Some filesystems may internally perform
+.BR fsync (2)
+before linking the inode to provide this guaranty,
+but often, filesystems will have a more efficient method to provide this
+guaranty without flushing volatile disk caches.
+
+A filesystem that accepts this flag does
+.BR NOT
+guaranty that the new file name will exist after a system crash, nor that the
+current inode metadata is persisted to disk.
+Specifically, if a file has hardlinks, the existance of the linked name after
+a system crash does
+.BR NOT
+guaranty that any of the other file names exist, nor that the last observed
+value of
+.I st_nlink
+(see
+.BR stat (2))
+has persisted.
+.TP
+.BR AT_ATOMIC_DATA " (since Linux 5.x)"
+By default, a link operation followed by a system crash, may result in the
+new file name being linked with old data or missing data.
+One way to prevent this is to call
+.BR fdatasync (2)
+before linking the inode, but that involves flushing of volatile disk caches.
+
+A filesystem that accepts this flag will guaranty, that old data
+will not be exposed in the new linked name.
+Some filesystems may internally perform
+.BR fsync (2)
+before linking the inode to provide this guaranty,
+but often, filesystems will have a more efficient method to provide this
+guaranty without flushing volatile disk caches.
+
+A filesystem that accepts this flag does
+.BR NOT
+guaranty that the new file name will exist after a system crash, nor that the
+current inode data is persisted to disk.
+.TP
 .SH RETURN VALUE
 On success, zero is returned.
 On error, \-1 is returned, and
-- 
2.17.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ