Date: Thu, 29 Jun 2023 16:54:29 +0100
From: David Howells <dhowells@...hat.com>
To: netdev@...r.kernel.org
Cc: David Howells <dhowells@...hat.com>,
	Matthew Wilcox <willy@...radead.org>,
	Dave Chinner <david@...morbit.com>,
	Matt Whitlock <kernel@...twhitlock.name>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jens Axboe <axboe@...nel.dk>,
	linux-fsdevel@...ck.org,
	linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: [RFC PATCH 0/4] splice: Fix corruption in data spliced to pipe

Due to the way splice() and vmsplice() currently splice active pages from
the pagecache or process VM into the intermediary pipe, the data in those
pages can change whilst they're held in the pipe, for example through
write(), writing through a shared-writable mmap or using fallocate() to
mangle the file[1].
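
For illustration, a minimal userspace sketch of the write() case: it
splices one block of a file into a pipe, overwrites the file, and then
checks which version the pipe delivers.  The function name, the 4096-byte
block size and the scratch-file path are assumptions made for this
example; they are not taken from the patches or from the report in [1].

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_LEN 4096	/* assumed block size for the demo */

/*
 * Hypothetical helper.  Returns 1 if the data read from the pipe reflects
 * a write made to the file after splice() returned, 0 if the pipe still
 * delivers the original data, and -1 on any error.
 */
int pipe_data_changes_after_splice(const char *path)
{
	char before[DEMO_LEN], after[DEMO_LEN], seen[DEMO_LEN];
	int pfd[2] = { -1, -1 };
	loff_t off = 0;
	int result = -1;
	int fd;

	memset(before, 'A', sizeof(before));
	memset(after, 'B', sizeof(after));

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;
	if (write(fd, before, sizeof(before)) != (ssize_t)sizeof(before))
		goto out;
	if (pipe(pfd) < 0)
		goto out;

	/* Splice the file's first block into the pipe. */
	if (splice(fd, &off, pfd[1], NULL, DEMO_LEN, 0) != DEMO_LEN)
		goto out;

	/* Overwrite the same block whilst the data sits in the pipe. */
	if (pwrite(fd, after, sizeof(after), 0) != (ssize_t)sizeof(after))
		goto out;

	/* See which version of the data the pipe hands back. */
	if (read(pfd[0], seen, sizeof(seen)) != (ssize_t)sizeof(seen))
		goto out;

	result = memcmp(seen, after, sizeof(seen)) == 0 ? 1 : 0;
out:
	if (pfd[0] >= 0)
		close(pfd[0]);
	if (pfd[1] >= 0)
		close(pfd[1]);
	close(fd);
	unlink(path);
	return result;
}
```

A return of 1 corresponds to the behaviour described above; a return of 0
means the pipe held a stable copy.  Which one is seen may depend on the
kernel version and the filesystem backing the scratch file.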

Matt Whitlock, Matthew Wilcox and Dave Chinner are of the opinion that data
in the pipe must not be seen to change and that if it does, this is a bug.
Apart from in one specific instance (vmsplice() with SPLICE_F_GIFT), the
manual pages agree with them.  I'm more inclined to adjust the
documentation since the behaviour we have has been that way since 2005, I
think.

These patches attempt to fix this by having splice() and vmsplice() steal
a page if possible, and copy the data if not, before adding it to the pipe.
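
The steal-or-copy decision can be sketched as a toy userspace model.  This
is not the code from the patches: struct toy_page, its refs field and
toy_steal_or_copy() are hypothetical names, and the real kernel test for
whether a page can be stolen involves more than a single reference count.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096	/* assumed page size for the model */

/* Hypothetical stand-in for a page; not a kernel structure. */
struct toy_page {
	int refs;				/* number of holders */
	unsigned char data[TOY_PAGE_SIZE];
};

/*
 * Return a page that only the pipe refers to.  If nobody else holds src,
 * it is stolen and handed over as-is; otherwise its contents are copied
 * into a fresh page.  *stolen reports which path was taken.  Returns NULL
 * if the copy could not be allocated.
 */
struct toy_page *toy_steal_or_copy(struct toy_page *src, int *stolen)
{
	struct toy_page *copy;

	if (src->refs == 1) {
		*stolen = 1;
		return src;		/* ownership moves to the pipe */
	}

	copy = malloc(sizeof(*copy));
	if (!copy)
		return NULL;
	copy->refs = 1;
	memcpy(copy->data, src->data, TOY_PAGE_SIZE);
	*stolen = 0;
	return copy;
}
```

In this model, later changes to a shared source page cannot reach the
pipe's page, since the pipe got a private copy of it.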

Whilst this does allow the code to be somewhat simplified, it also results
in a loss of performance: stolen pages have to be reloaded if accessed
again, and more data has to be copied.

Ideally, this should result in all pages in the pipe belonging solely to
the pipe and so they can be removed from the pipe and spliced into
pagecaches or process VM immediately with no further checking required.

Note that this conversion is incomplete.  It does not simplify fuse and
virtio_console and it does not clean up the splicing into pipes from
relayfs, watch_queue and sockets.

There's also a bug in the vmsplice() page stealing.  It mostly works but
after splicing a bunch of pages, it will oops somewhere in the interval
tree's macros.

I've pushed the patches here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=splice-fix-corruption

David

Link: https://lore.kernel.org/r/ec804f26-fa76-4fbe-9b1c-8fbbd829b735@mattwhitlock.name/ [1]

David Howells (4):
  splice: Fix corruption of spliced data after splice() returns
  splice: Make vmsplice() steal or copy
  splice: Remove some now-unused bits
  splice: Record some statistics

 fs/fuse/dev.c             |  37 -----
 fs/pipe.c                 |  12 --
 fs/splice.c               | 304 ++++++++++++++++++--------------------
 include/linux/pipe_fs_i.h |  14 --
 include/linux/splice.h    |   4 +-
 mm/filemap.c              |  98 +++++++++++-
 mm/internal.h             |   4 +-
 mm/shmem.c                |   8 +-
 8 files changed, 245 insertions(+), 236 deletions(-)

