lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <87k1xgkct1.fsf@notabene.neil.brown.name>
Date:   Thu, 21 Dec 2017 13:57:46 +1100
From:   NeilBrown <neilb@...e.com>
To:     Trond Myklebust <trond.myklebust@...marydata.com>,
        Anna Schumaker <Anna.Schumaker@...app.com>
Cc:     lkml <linux-kernel@...r.kernel.org>,
        Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
        linux-fsdevel@...r.kernel.org
Subject: [PATCH/RFC] NFS: add nostatflush mount option.


When an i_op->getattr() call is made on an NFS file
(typically from a 'stat' family system call), NFS
will first flush any dirty data to the server.

This ensures that the mtime reported is correct and stable,
but has a performance penalty.  'stat' is normally thought
to be a quick operation, and imposing this cost can be
surprising.

I have seen problems when one process is writing a large
file and another process performs "ls -l" on the containing
directory and is blocked for as long as it take to flush
all the dirty data to the server, which can be minutes.

I have also seen a legacy application which frequently calls
"fstat" on a file that it is writing to.  On a local
filesystem (and in the Solaris implementation of NFS) this
fstat call is cheap.  On Linux/NFS, the causes a noticeable
decrease in throughput.

The only circumstances where an application calling 'stat()'
might get an mtime which is not stable are times when some
other process is writing to the file and the two processes
are not using locking to ensure consistency, or when the one
process is both writing and stating.  In neither of these
cases is it reasonable to expect the mtime to be stable.

In the most common cases where mtime is important
(e.g. make), no other process has the file open, so there
will be no dirty data and the mtime will be stable.

Rather than unilaterally changing this behavior of 'stat',
this patch adds a "nosyncflush" mount option to allow
sysadmins to have applications which are hurt by the current
behavior to disable it.

Note that this option should probably *not* be used together
with "nocto".  In that case, mtime could be unstable even
when no process has the file open.

Signed-off-by: NeilBrown <neilb@...e.com>
---
 fs/nfs/inode.c                 |  3 ++-
 fs/nfs/super.c                 | 10 ++++++++++
 include/uapi/linux/nfs_mount.h |  6 ++++--
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index b992d2382ffa..16629a34dd62 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -740,7 +740,8 @@ int nfs_getattr(const struct path *path, struct kstat *stat,
 
 	trace_nfs_getattr_enter(inode);
 	/* Flush out writes to the server in order to update c/mtime.  */
-	if (S_ISREG(inode->i_mode)) {
+	if (S_ISREG(inode->i_mode) &&
+	    !(NFS_SERVER(inode)->flags & NFS_MOUNT_NOSTATFLUSH)) {
 		err = filemap_write_and_wait(inode->i_mapping);
 		if (err)
 			goto out;
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 29bacdc56f6a..2351c0be98f5 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -90,6 +90,7 @@ enum {
 	Opt_resvport, Opt_noresvport,
 	Opt_fscache, Opt_nofscache,
 	Opt_migration, Opt_nomigration,
+	Opt_statflush, Opt_nostatflush,
 
 	/* Mount options that take integer arguments */
 	Opt_port,
@@ -151,6 +152,8 @@ static const match_table_t nfs_mount_option_tokens = {
 	{ Opt_nofscache, "nofsc" },
 	{ Opt_migration, "migration" },
 	{ Opt_nomigration, "nomigration" },
+	{ Opt_statflush, "statflush" },
+	{ Opt_nostatflush, "nostatflush" },
 
 	{ Opt_port, "port=%s" },
 	{ Opt_rsize, "rsize=%s" },
@@ -637,6 +640,7 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
 		{ NFS_MOUNT_NORDIRPLUS, ",nordirplus", "" },
 		{ NFS_MOUNT_UNSHARED, ",nosharecache", "" },
 		{ NFS_MOUNT_NORESVPORT, ",noresvport", "" },
+		{ NFS_MOUNT_NOSTATFLUSH, ",nostatflush", "" },
 		{ 0, NULL, NULL }
 	};
 	const struct proc_nfs_info *nfs_infop;
@@ -1334,6 +1338,12 @@ static int nfs_parse_mount_options(char *raw,
 		case Opt_nomigration:
 			mnt->options &= ~NFS_OPTION_MIGRATION;
 			break;
+		case Opt_statflush:
+			mnt->flags &= ~NFS_MOUNT_NOSTATFLUSH;
+			break;
+		case Opt_nostatflush:
+			mnt->flags |= NFS_MOUNT_NOSTATFLUSH;
+			break;
 
 		/*
 		 * options that take numeric values
diff --git a/include/uapi/linux/nfs_mount.h b/include/uapi/linux/nfs_mount.h
index e44e00616ab5..d7c6f809d25d 100644
--- a/include/uapi/linux/nfs_mount.h
+++ b/include/uapi/linux/nfs_mount.h
@@ -72,7 +72,9 @@ struct nfs_mount_data {
 #define NFS_MOUNT_NORESVPORT		0x40000
 #define NFS_MOUNT_LEGACY_INTERFACE	0x80000
 
-#define NFS_MOUNT_LOCAL_FLOCK	0x100000
-#define NFS_MOUNT_LOCAL_FCNTL	0x200000
+#define NFS_MOUNT_LOCAL_FLOCK		0x100000
+#define NFS_MOUNT_LOCAL_FCNTL		0x200000
+
+#define NFS_MOUNT_NOSTATFLUSH		0x400000
 
 #endif
-- 
2.14.0.rc0.dirty


Download attachment "signature.asc" of type "application/pgp-signature" (833 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ