[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20210531170123.243771-1-agruenba@redhat.com>
Date: Mon, 31 May 2021 19:01:14 +0200
From: Andreas Gruenbacher <agruenba@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andreas Gruenbacher <agruenba@...hat.com>,
cluster-devel@...hat.com, linux-kernel@...r.kernel.org,
Alexander Viro <viro@...iv.linux.org.uk>,
Jan Kara <jack@...e.cz>, Matthew Wilcox <willy@...radead.org>
Subject: [RFC 0/9] gfs2: handle page faults during read and write
Hi Linus,
here's a set of fixes for how gfs2 handles page faults during read and
write syscalls. The patch queue is ready for merging except for two
open issues. May I ask you to shed some light or ask the right person
to help us out?
Right now, the filesystem can end up trying to take a lock it's already
holding, which causes it to BUG. We can recognize and mostly deal with
that simple case, but more complex scenarios exist which involve
multiple locks and / or incompatible locking modes. To handle those, we
switch to trylocks where necessary.
The patches appear to be working as itended, with the following
remaining questions:
(1) Jan Kara has pointed out [*] that returning VM_FAULT_SIGBUS from the
.fault / .page_mkwrite ops will raise SIGBUS, which would be visible to
user space. This hasn't been observed in our testing; instead, accesses
to the mmapped memory via call chains like:
iov_iter_fault_in_readable -> fault_in_pages_readable -> __get_user
would simply fail with -EFAULT, as we're expecting.
>From looking at do_user_addr_fault and do_sigbus in arch/x86/mm/fault.c,
my impression is that VM_FAULT_SIGBUS will not cause SIGBUS to be raised
in kernel mode, and that we can rely on the -EFAULT behavior for
triggering the retries at the filesystem level.
(2) The patch queue introduces the same kind of trylock behavior for
both .fault (gfs2_fault) and .page_mkwrite (gfs2_page_mkwrite). I'm not
aware of any situation in which we can actually end up in .page_mkwrite
during a .read_iter or .write_iter operation, so the trylock behavior in
.page_mkwrite might be harmless but unnecessary.
Thank you very much,
Andreas
[*] https://listman.redhat.com/archives/cluster-devel/2021-May/msg00080.html
Previous posting of this patch queue:
https://listman.redhat.com/archives/cluster-devel/2021-May/msg00073.html
New xfstest for mmap + page faults during read / write:
https://lore.kernel.org/fstests/20210531152604.240462-1-agruenba@redhat.com/
Andreas Gruenbacher (9):
gfs2: Clean up the error handling in gfs2_page_mkwrite
gfs2: Add wrapper for iomap_file_buffered_write
gfs2: Add gfs2_holder_is_compatible helper
gfs2: Fix mmap + page fault deadlocks (part 1)
iov_iter: Add iov_iter_fault_in_writeable()
gfs2: Add wrappers for accessing journal_info
gfs2: Encode glock holding and retry flags in journal_info
gfs2: Add LM_FLAG_OUTER glock holder flag
gfs2: Fix mmap + page fault deadlocks (part 2)
fs/gfs2/aops.c | 6 +-
fs/gfs2/bmap.c | 31 ++++----
fs/gfs2/file.c | 175 +++++++++++++++++++++++++++++++++-----------
fs/gfs2/glock.c | 12 +++
fs/gfs2/glock.h | 27 ++++++-
fs/gfs2/incore.h | 41 +++++++++++
fs/gfs2/inode.c | 2 +-
fs/gfs2/log.c | 4 +-
fs/gfs2/lops.c | 2 +-
fs/gfs2/meta_io.c | 6 +-
fs/gfs2/super.c | 2 +-
fs/gfs2/trans.c | 16 ++--
include/linux/uio.h | 1 +
lib/iov_iter.c | 20 ++++-
14 files changed, 265 insertions(+), 80 deletions(-)
--
2.26.3
Powered by blists - more mailing lists