Message-Id: <20191122235324.17245-1-agruenba@redhat.com>
Date: Sat, 23 Nov 2019 00:53:21 +0100
From: Andreas Gruenbacher <agruenba@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Steven Whitehouse <swhiteho@...hat.com>,
Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
"Kirill A. Shutemov" <kirill@...temov.name>, linux-mm@...ck.org,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
Alexander Viro <viro@...iv.linux.org.uk>,
Johannes Weiner <hannes@...xchg.org>, cluster-devel@...hat.com,
Ronnie Sahlberg <lsahlber@...hat.com>,
Steve French <sfrench@...ba.org>,
Bob Peterson <rpeterso@...hat.com>,
Andreas Gruenbacher <agruenba@...hat.com>
Subject: [RFC PATCH 0/3] Rework the gfs2 read and page fault locking

Hello,

this patch series moves glock acquisition in gfs2 from the ->readpage
and ->readpages address space operations to the ->read_iter file and
->fault vm operations. To achieve that, we add flags to the
generic_file_read_iter and filemap_fault generic helpers.

This proposal was triggered by the following discussion:

https://lore.kernel.org/linux-fsdevel/157225677483.3442.4227193290486305330.stgit@buzz/

In that thread, Linus argued that filesystems should make sure the inode
size is sufficiently up-to-date before calling the generic helpers, and
that filesystems can do it themselves if they want more than that.

That's surely doable. However, implementing those operations properly
at the filesystem level quickly becomes complicated when it gets to
things like readahead. In addition, slightly modified copies of the
generic helpers would surely diverge from their originals over time,
and maintaining them properly would become hard. So I hope that the
relatively small changes that make the original helpers slightly more
flexible will be acceptable instead.

With the IOCB_CACHED flag added by one of the patches in this series,
the code that Konstantin's initial patch adds to
generic_file_buffered_read could be made conditional on the IOCB_CACHED
flag being cleared. That way, it won't misfire on filesystems that
allow a stale inode size. (I'm not sure if any filesystems other than
gfs2 are actually affected.)
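
To illustrate the suggestion above, here is a minimal userspace sketch,
not kernel code. It assumes that the added check is an end-of-file test
against the inode size, which this message implies but does not spell
out. The function name toy_eof_check_fires and the bit value of
IOCB_CACHED are made up for the example; only the flag name comes from
this series.

```c
#include <stdbool.h>
#include <stdint.h>

#define IOCB_CACHED 0x1  /* placeholder bit value, not the kernel's */

/*
 * Hypothetical model of an inode-size check in the buffered read path.
 * Returns true if the read should stop because pos lies at or beyond
 * the (possibly stale) inode size.
 */
static bool toy_eof_check_fires(int ki_flags, int64_t pos, int64_t i_size)
{
	/*
	 * In cached-only mode the inode size may be stale, so the check
	 * is skipped rather than trusted.
	 */
	if (ki_flags & IOCB_CACHED)
		return false;

	/* Flag cleared: the size is assumed to be up to date. */
	return pos >= i_size;
}
```

With IOCB_CACHED set, a stale size of 0 does not cut a read at offset
4096 short; with the flag cleared, the same arguments make the check
fire as usual.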

Some additional explanation:

The cache consistency model of filesystems like gfs2 is such that if
pages are found in an inode's address space, those pages as well as the
inode size are up to date and can be used without taking any filesystem
locks. If a page is not cached, filesystem locks must be taken before
the page can be read; this will also bring the inode size up to date.

Thus far, gfs2 has taken the filesystem locks inside the ->readpage and
->readpages address space operations. A better approach seems to be to
take those locks earlier, in the ->read_iter file and ->fault vm
operations. This would also avoid a lock inversion in ->readpages.

We obviously want to avoid taking the filesystem locks unnecessarily
when the pages we are looking for are already cached; otherwise, we
would cripple performance. So we need to check whether those pages are
present first. That is exactly what the generic_file_read_iter and
filemap_fault helpers already do, except that they call into ->readpage
and ->readpages when they find pages missing. Instead, we'd like those
helpers to return an error code that allows us to retry the operation
after taking the filesystem locks.
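
The retry scheme described above can be sketched in plain userspace C.
This is a toy model under stated assumptions, not the code in the
patches: the toy_* names, the struct layout, the bit value of
IOCB_CACHED, and the choice of -EAGAIN as the "retry" error code are all
hypothetical. This message only says that the helpers return an error
code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define IOCB_CACHED 0x1  /* placeholder bit value, not the kernel's */
#define TOY_NPAGES  4

/* Toy stand-in for an inode with a page cache and a filesystem lock. */
struct toy_inode {
	bool cached[TOY_NPAGES];  /* which pages are in the "page cache" */
	bool locked;              /* is the "glock" currently held? */
	int lock_count;           /* how often the "glock" was taken */
};

/*
 * Toy stand-in for the generic read helper.  With IOCB_CACHED set it
 * only uses pages that are already cached and otherwise bails out with
 * an (assumed) -EAGAIN instead of reading the page.
 */
static int toy_generic_read(struct toy_inode *inode, size_t index, int flags)
{
	if (index >= TOY_NPAGES)
		return -EINVAL;
	if (inode->cached[index])
		return 0;               /* cache hit: no lock needed */
	if (flags & IOCB_CACHED)
		return -EAGAIN;         /* assumed code: caller should retry */
	assert(inode->locked);          /* real reads happen under the lock */
	inode->cached[index] = true;    /* pretend ->readpage filled the page */
	return 0;
}

/* Toy stand-in for a filesystem's ->read_iter. */
static int toy_read_iter(struct toy_inode *inode, size_t index)
{
	int ret;

	/* Fast path: only use pages that are already cached. */
	ret = toy_generic_read(inode, index, IOCB_CACHED);
	if (ret != -EAGAIN)
		return ret;

	/* Slow path: take the lock, then let the helper read the page. */
	inode->locked = true;
	inode->lock_count++;
	ret = toy_generic_read(inode, index, 0);
	inode->locked = false;
	return ret;
}
```

In this model, reading a page that is already cached never touches the
lock, while the first read of a missing page takes it exactly once; a
second read of the same page is then served from the cache. The page
fault path would follow the same pattern with FAULT_FLAG_CACHED.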

Thanks,
Andreas

Andreas Gruenbacher (3):
  fs: Add IOCB_CACHED flag for generic_file_read_iter
  fs: Add FAULT_FLAG_CACHED flag for filemap_fault
  gfs2: Rework read and page fault locking

 fs/gfs2/aops.c     | 36 +++++--------------------
 fs/gfs2/file.c     | 66 ++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/fs.h |  1 +
 include/linux/mm.h |  4 ++-
 mm/filemap.c       | 60 ++++++++++++++++++++++++++++++-----------
 5 files changed, 119 insertions(+), 48 deletions(-)

--
2.20.1