Message-ID: <20230629155433.4170837-1-dhowells@redhat.com>
Date: Thu, 29 Jun 2023 16:54:29 +0100
From: David Howells <dhowells@...hat.com>
To: netdev@...r.kernel.org
Cc: David Howells <dhowells@...hat.com>,
Matthew Wilcox <willy@...radead.org>,
Dave Chinner <david@...morbit.com>,
Matt Whitlock <kernel@...twhitlock.name>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jens Axboe <axboe@...nel.dk>,
linux-fsdevel@...ck.org,
linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: [RFC PATCH 0/4] splice: Fix corruption in data spliced to pipe
Due to the way splice() and vmsplice() currently splice active pages from
the pagecache or process VM into the intermediary pipe, the data in those
pages can change whilst they're held in the pipe, for example through
write(), through writing via a shared-writable mmap or through using
fallocate() to mangle the file[1].
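
As a rough userspace illustration of the splice() case (not part of the
patch set; the helper name splice_then_overwrite() and the temporary file
path are made up for this sketch), the following splices a few bytes from a
file into a pipe, overwrites the file, and only then reads the pipe. Which
contents come out depends on the kernel and the filesystem backing /tmp: if
the pipe still references the pagecache page, the new data may be seen; if
the page was stolen or copied, the old data is seen.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only.
 * Writes 'before' to a temp file, splices it into a pipe, overwrites the
 * file with 'after' (same length), then reads the pipe into 'out'.
 * Returns the number of bytes read from the pipe, or -1 on any failure.
 */
static ssize_t splice_then_overwrite(const char *before, const char *after,
				     char *out, size_t outlen)
{
	char path[] = "/tmp/splice-demo-XXXXXX";	/* assumed location */
	size_t len = strlen(before);
	int pfd[2] = { -1, -1 };
	ssize_t ret = -1;
	loff_t off = 0;
	int fd;

	assert(strlen(after) == len && len <= outlen);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	if (write(fd, before, len) != (ssize_t)len)
		goto out;
	if (pipe(pfd) < 0)
		goto out;

	/* Move the file data into the pipe without going through userspace. */
	if (splice(fd, &off, pfd[1], NULL, len, 0) != (ssize_t)len)
		goto out;

	/* Change the file after splice() has already returned. */
	if (pwrite(fd, after, len, 0) != (ssize_t)len)
		goto out;

	/* What the pipe yields now is the behaviour under discussion. */
	ret = read(pfd[0], out, outlen);
out:
	if (pfd[0] >= 0) {
		close(pfd[0]);
		close(pfd[1]);
	}
	close(fd);
	return ret;
}
```

Calling splice_then_overwrite("AAAAA", "BBBBB", buf, sizeof(buf)) and
printing buf shows whether the pipe contents tracked the later write.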
Matt Whitlock, Matthew Wilcox and Dave Chinner are of the opinion that data
in the pipe must not be seen to change and that if it does, this is a bug.
Apart from in one specific instance (vmsplice() with SPLICE_F_GIFT), the
manual pages agree with them. I'm more inclined to adjust the
documentation since the behaviour we have has been that way since 2005, I
think.
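
The vmsplice() case without SPLICE_F_GIFT can be sketched the same way.
Again this is only an illustration under stated assumptions: the helper name
vmsplice_then_modify() is hypothetical, and the outcome depends on the
kernel. The buffer is modified after vmsplice() returns and before the pipe
is read, so the byte that comes out tells you whether the pipe shared the
page with the process or held its own copy.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only.
 * Fills one anonymous page with 'before', vmsplices it into a pipe
 * (flags == 0, i.e. no SPLICE_F_GIFT), refills the page with 'after',
 * then reads up to 'outlen' bytes from the pipe into 'out'.
 * Returns the number of bytes read, or -1 on any failure.
 */
static ssize_t vmsplice_then_modify(char before, char after,
				    char *out, size_t outlen)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	struct iovec iov;
	ssize_t ret = -1;
	int pfd[2];
	char *buf;

	buf = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	memset(buf, before, pagesz);

	if (pipe(pfd) < 0)
		goto out_unmap;

	iov.iov_base = buf;
	iov.iov_len = pagesz;
	if (vmsplice(pfd[1], &iov, 1, 0) != pagesz)
		goto out_close;

	/* Scribble on the buffer after vmsplice() has returned. */
	memset(buf, after, pagesz);

	ret = read(pfd[0], out, outlen);
out_close:
	close(pfd[0]);
	close(pfd[1]);
out_unmap:
	munmap(buf, pagesz);
	return ret;
}
```

With vmsplice_then_modify('A', 'B', buf, sizeof(buf)), a leading 'B' in buf
means the later memset() was visible through the pipe, and a leading 'A'
means it was not.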
These patches attempt to fix this by stealing a page if possible, and
copying the data if not, before splice() or vmsplice() adds it to the pipe.
Whilst this does allow the code to be somewhat simplified, it also results
in a loss of performance: stolen pages have to be reloaded if accessed
again; more data has to be copied.
Ideally, this should result in all pages in the pipe belonging solely to
the pipe and so they can be removed from the pipe and spliced into
pagecaches or process VM immediately with no further checking required.
Note that this conversion is incomplete. It does not simplify fuse and
virtio_console and it does not clean up the splicing into pipes from
relayfs, watch_queue and sockets.
There's also a bug in the vmsplice() page stealing. It mostly works but
after splicing a bunch of pages, it will oops somewhere in the interval
tree's macros.
I've pushed the patches here also:
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=splice-fix-corruption
David
Link: https://lore.kernel.org/r/ec804f26-fa76-4fbe-9b1c-8fbbd829b735@mattwhitlock.name/ [1]
David Howells (4):
  splice: Fix corruption of spliced data after splice() returns
  splice: Make vmsplice() steal or copy
  splice: Remove some now-unused bits
  splice: Record some statistics

 fs/fuse/dev.c             |  37 -----
 fs/pipe.c                 |  12 --
 fs/splice.c               | 304 ++++++++++++++++++--------------------
 include/linux/pipe_fs_i.h |  14 --
 include/linux/splice.h    |   4 +-
 mm/filemap.c              |  98 +++++++++++-
 mm/internal.h             |   4 +-
 mm/shmem.c                |   8 +-
 8 files changed, 245 insertions(+), 236 deletions(-)