Date: Thu, 29 Jun 2023 19:34:08 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Matt Whitlock <kernel@...twhitlock.name>,
	David Howells <dhowells@...hat.com>, netdev@...r.kernel.org,
	Dave Chinner <david@...morbit.com>, Jens Axboe <axboe@...nel.dk>,
	linux-fsdevel@...ck.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/4] splice: Fix corruption in data spliced to pipe

On Thu, Jun 29, 2023 at 11:19:36AM -0700, Linus Torvalds wrote:
> On Thu, 29 Jun 2023 at 11:05, Matt Whitlock <kernel@...twhitlock.name> wrote:
> >
> > I don't know why SPLICE_F_MOVE is being ignored in this thread. Sure, maybe
> > the way it has historically been implemented was only relevant when the
> > input FD is a pipe, but that's not what the man page implies. You have the
> > opportunity to make it actually do what it says on the tin.
> 
> First off, when documentation and reality disagree, it's the
> documentation that is garbage.
> 
> Secondly, your point is literally moot, from what I can tell:
> 
>        SPLICE_F_MOVE
>               Unused for vmsplice(); see splice(2).
> 
> that's the doc I see right now for "man vmsplice".
> 
> There's no "implies" there. There's an actual big honking clear
> statement at the top of the man-page saying that what you claim is
> simply not even remotely true.
> 
> Also, the reason SPLICE_F_MOVE is unused for vmsplice() is that
> actually trying to move pages would involve having to *remove* them
> from the VM source. And the TLB invalidation involved with that is
> literally more expensive than the memory copy would be.

I think David muddied the waters by talking about vmsplice().  The
problem encountered is with splice() from the page cache.  Reading
the documentation,

       splice() moves data between two file descriptors without copying
       between kernel address space and user address space.  It transfers up to
       len bytes of data from the file descriptor fd_in to the file descriptor
       fd_out, where one of the file descriptors must refer to a pipe.

The bug reported is actually with using FALLOC_FL_PUNCH_HOLE, but a
simpler problem is:

#define _GNU_SOURCE
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>

#define PAGE_SIZE 4096

int main(int argc, char **argv)
{
        int fd = open(argv[1], O_RDWR | O_CREAT, 0644);
        int err;

        err = ftruncate(fd, PAGE_SIZE);
        if (err < 0) {
                perror("ftruncate");
                return 1;
        }
        pwrite(fd, "old", 3, 0);
        /* stdout (fd 1) must be a pipe, e.g. run as: ./a.out file | cat */
        splice(fd, NULL, 1, NULL, PAGE_SIZE, 0);
        pwrite(fd, "new", 3, 0);

        return 0;
}

That outputs "new".  Should it?  If so, the manpage is really wrong.
It says the point of splice() is to remove the kernel-user-kernel copy,
and notes that zerocopy might be happening, but that's an optimisation
the user shouldn't notice.
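
For anyone who wants to check this without relying on shell plumbing, here is a minimal sketch of the same sequence that reads the pipe back inside one process. The helper name splice_then_overwrite(), the scratch file under /tmp, and the 4096-byte page size are assumptions for illustration, not part of the original report. Which string comes back may depend on the kernel version and the filesystem, so the sketch only reports what it finds.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE 4096  /* assumed page size, as in the program above */

/*
 * Hypothetical helper: splice one page of a scratch file into a pipe,
 * overwrite the file, and only then read the pipe.
 * Stores the first three bytes seen in the pipe (NUL-terminated) in out.
 * Returns 0 on success, -1 on any failure.
 */
int splice_then_overwrite(char out[4])
{
        char path[] = "/tmp/splice-demo-XXXXXX";
        char page[PAGE_SIZE];
        int pipefd[2];
        int ret = -1;
        int fd;

        fd = mkstemp(path);
        if (fd < 0)
                return -1;
        unlink(path);   /* scratch file goes away when fd is closed */

        if (pipe(pipefd) < 0)
                goto out_close;

        if (ftruncate(fd, PAGE_SIZE) < 0 || pwrite(fd, "old", 3, 0) != 3)
                goto out_pipe;

        /* Queue the file's first page in the pipe; nothing reads it yet. */
        if (splice(fd, NULL, pipefd[1], NULL, PAGE_SIZE, 0) < 3)
                goto out_pipe;

        /* Modify the file after splice() has already returned. */
        if (pwrite(fd, "new", 3, 0) != 3)
                goto out_pipe;

        /* Only now consume the pipe and report what it holds. */
        if (read(pipefd[0], page, sizeof(page)) < 3)
                goto out_pipe;

        memcpy(out, page, 3);
        out[3] = '\0';
        ret = 0;

out_pipe:
        close(pipefd[0]);
        close(pipefd[1]);
out_close:
        close(fd);
        return ret;
}
```

Calling it with a four-byte buffer and printing the result shows whether the pipe carried a snapshot taken at splice() time ("old") or still refers to the live page cache page ("new").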
