Message-Id: <ec267e95fd21891986373c7af1c72b4c8b507332.1376679411.git.luto@amacapital.net>
Date:	Fri, 16 Aug 2013 16:22:10 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	linux-kernel@...r.kernel.org
Cc:	linux-ext4@...r.kernel.org, Dave Chinner <david@...morbit.com>,
	Theodore Ts'o <tytso@....edu>,
	Dave Hansen <dave.hansen@...ux.intel.com>, xfs@....sgi.com,
	Jan Kara <jack@...e.cz>, Tim Chen <tim.c.chen@...ux.intel.com>,
	Christoph Hellwig <hch@...radead.org>,
	Andy Lutomirski <luto@...capital.net>
Subject: [PATCH v3 3/5] mm: Notify filesystems when it's time to apply a deferred cmtime update

Filesystems that defer cmtime updates should update cmtime when any
of these events happen after a write via a mapping:

 - The mapping is written back to disk.  This happens from all kinds
   of places, all of which eventually call ->writepages.

 - munmap is called, or the mapping is removed when the process exits.

 - msync(MS_ASYNC) is called.  Linux currently does nothing for
   msync(MS_ASYNC), but POSIX says that cmtime should be updated some
   time between an mmapped write and the subsequent msync call.
   MS_SYNC calls ->writepages, but MS_ASYNC needs special handling.

Filesystems that defer cmtime updates should flush them on munmap or
exit.  Finding out that this happened through vm_ops is messy, so
add a new address space op for this.

It's not strictly necessary to call ->flush_cmtime after ->writepages,
but it simplifies the fs code.  As an optional optimization,
filesystems can call mapping_test_clear_cmtime themselves in
->writepages (as long as they're careful to scan all the pages first
-- the cmtime bit may not be set when ->writepages is entered).
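
For illustration only, the call ordering described above can be modeled
in userspace.  This is a minimal sketch, not kernel code: every model_*
name is hypothetical, an atomic flag stands in for the deferred-cmtime
bit, a counter stands in for inode_update_time_writable(), and the
assumption that the bit becomes set while ->writepages cleans a page is
taken from the comment added to do_writepages() in this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_mapping;

/* Hypothetical stand-in for struct address_space_operations. */
struct model_aops {
	int  (*writepages)(struct model_mapping *);
	void (*flush_cmtime)(struct model_mapping *);	/* optional */
};

/* Hypothetical stand-in for struct address_space plus its inode. */
struct model_mapping {
	const struct model_aops *a_ops;
	atomic_bool cmtime_pending;	/* models the deferred-cmtime bit */
	bool page_dirty;		/* one pretend dirty page */
	int time_updates;		/* counts pretend inode time updates */
};

/* A store through a shared mapping only dirties the page. */
static void model_mmap_write(struct model_mapping *m)
{
	m->page_dirty = true;
}

/* Models mapping_test_clear_cmtime(): atomically test and clear. */
static bool model_test_clear_cmtime(struct model_mapping *m)
{
	return atomic_exchange(&m->cmtime_pending, false);
}

/* Models generic_flush_cmtime(): update the time only if pending. */
static void model_generic_flush_cmtime(struct model_mapping *m)
{
	if (model_test_clear_cmtime(m))
		m->time_updates++;
}

/* Cleaning the page is what marks the mapping (assumption, see above). */
static int model_writepages(struct model_mapping *m)
{
	if (m->page_dirty) {
		m->page_dirty = false;
		atomic_store(&m->cmtime_pending, true);
	}
	return 0;
}

/* Models do_writepages(): flush after ->writepages, if implemented. */
static int model_do_writepages(struct model_mapping *m)
{
	int ret = m->a_ops->writepages(m);

	if (m->a_ops->flush_cmtime)
		m->a_ops->flush_cmtime(m);
	return ret;
}

/* Models the remove_vma() hook for a shared, file-backed vma. */
static void model_remove_shared_vma(struct model_mapping *m)
{
	if (m->a_ops && m->a_ops->flush_cmtime)
		m->a_ops->flush_cmtime(m);
}

/* Walks through write, writeback, idle writeback, then unmap. */
static int model_run_demo(void)
{
	static const struct model_aops ops = {
		.writepages	= model_writepages,
		.flush_cmtime	= model_generic_flush_cmtime,
	};
	struct model_mapping m = { .a_ops = &ops };

	atomic_init(&m.cmtime_pending, false);

	model_mmap_write(&m);
	assert(m.time_updates == 0);	/* the store alone updates nothing */
	model_do_writepages(&m);	/* bit set, then flushed */
	model_do_writepages(&m);	/* nothing dirty, nothing to flush */
	model_remove_shared_vma(&m);	/* bit already clear, no update */
	return m.time_updates;
}
```

Because the flush is a test-and-clear, calling it from several places
is harmless in this model: only the first call after the bit is set
performs an update.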

This patch does not implement the MS_ASYNC case; that's in the next
patch.

Signed-off-by: Andy Lutomirski <luto@...capital.net>
---
 include/linux/fs.h        |  9 +++++++++
 include/linux/writeback.h |  1 +
 mm/mmap.c                 |  9 ++++++++-
 mm/page-writeback.c       | 26 ++++++++++++++++++++++++++
 4 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 86cf0a4..f224155 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -350,6 +350,15 @@ struct address_space_operations {
 	/* Write back some dirty pages from this mapping. */
 	int (*writepages)(struct address_space *, struct writeback_control *);
 
+	/*
+	 * Userspace expects certain system calls to update cmtime if
+	 * a file has been recently written using a shared vma.  In
+	 * cases where cmtime may need to be updated but writepages is
+	 * not called, this is called instead.  (Implementations
+	 * should call mapping_test_clear_cmtime.)
+	 */
+	void (*flush_cmtime)(struct address_space *);
+
 	/* Set a page dirty.  Return true if this dirtied it */
 	int (*set_page_dirty)(struct page *page);
 
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 4e198ca..f6e8261 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -174,6 +174,7 @@ typedef int (*writepage_t)(struct page *page, struct writeback_control *wbc,
 
 int generic_writepages(struct address_space *mapping,
 		       struct writeback_control *wbc);
+void generic_flush_cmtime(struct address_space *mapping);
 void tag_pages_for_writeback(struct address_space *mapping,
 			     pgoff_t start, pgoff_t end);
 int write_cache_pages(struct address_space *mapping,
diff --git a/mm/mmap.c b/mm/mmap.c
index 1edbaa3..7ed7700 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1,3 +1,4 @@
+
 /*
  * mm/mmap.c
  *
@@ -249,8 +250,14 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
 	might_sleep();
 	if (vma->vm_ops && vma->vm_ops->close)
 		vma->vm_ops->close(vma);
-	if (vma->vm_file)
+	if (vma->vm_file) {
+		if ((vma->vm_flags & VM_SHARED) && vma->vm_file->f_mapping) {
+			struct address_space *mapping = vma->vm_file->f_mapping;
+			if (mapping->a_ops && mapping->a_ops->flush_cmtime)
+				mapping->a_ops->flush_cmtime(mapping);
+		}
 		fput(vma->vm_file);
+	}
 	mpol_put(vma_policy(vma));
 	kmem_cache_free(vm_area_cachep, vma);
 	return next;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 3f0c895..9ab8c9e 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1928,6 +1928,18 @@ int do_writepages(struct address_space *mapping, struct writeback_control *wbc)
 		ret = mapping->a_ops->writepages(mapping, wbc);
 	else
 		ret = generic_writepages(mapping, wbc);
+
+	/*
+	 * ->writepages will call clear_page_dirty_for_io, which may, in turn,
+	 * mark the mapping for deferred cmtime update.  As an optimization,
+	 * a filesystem can flush the update at the end of ->writepages
+	 * (possibly avoiding a journal transaction, for example), but,
+	 * for simplicity, let filesystems skip that part and just implement
+	 * ->flush_cmtime.
+	 */
+	if (mapping->a_ops->flush_cmtime)
+		mapping->a_ops->flush_cmtime(mapping);
+
 	return ret;
 }
 
@@ -1970,6 +1982,20 @@ int write_one_page(struct page *page, int wait)
 }
 EXPORT_SYMBOL(write_one_page);
 
+/**
+ * generic_flush_cmtime - perform a deferred cmtime update if needed
+ * @mapping: address space structure
+ *
+ * This is a library function, which implements the flush_cmtime()
+ * address_space_operation.
+ */
+void generic_flush_cmtime(struct address_space *mapping)
+{
+	if (mapping_test_clear_cmtime(mapping))
+		inode_update_time_writable(mapping->host);
+}
+EXPORT_SYMBOL(generic_flush_cmtime);
+
 /*
  * For address_spaces which do not use buffers nor write back.
  */
-- 
1.8.3.1
