lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YeVvXToTxCsMzHZv@casper.infradead.org>
Date:   Mon, 17 Jan 2022 13:30:05 +0000
From:   Matthew Wilcox <willy@...radead.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     David Howells <dhowells@...hat.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Anna Schumaker <anna.schumaker@...app.com>,
        Dave Wysochanski <dwysocha@...hat.com>,
        Dominique Martinet <asmadeus@...ewreck.org>,
        Jeff Layton <jlayton@...nel.org>,
        Latchesar Ionkov <lucho@...kov.net>,
        Marc Dionne <marc.dionne@...istor.com>,
        Omar Sandoval <osandov@...ndov.com>,
        Shyam Prasad N <nspmangalore@...il.com>,
        Steve French <sfrench@...ba.org>,
        Trond Myklebust <trondmy@...merspace.com>,
        Peter Zijlstra <peterz@...radead.org>,
        ceph-devel@...r.kernel.org, linux-afs@...ts.infradead.org,
        linux-cachefs@...hat.com, CIFS <linux-cifs@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>,
        "open list:NFS, SUNRPC, AND..." <linux-nfs@...r.kernel.org>,
        v9fs-developer@...ts.sourceforge.net,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Out of order read() completion and buffer filling beyond
 returned amount

On Mon, Jan 17, 2022 at 12:19:29PM +0200, Linus Torvalds wrote:
> On Mon, Jan 17, 2022 at 11:57 AM David Howells <dhowells@...hat.com> wrote:
> >
> > Do you have an opinion on whether it's permissible for a filesystem to write
> > into the read() buffer beyond the amount it claims to return, though still
> > within the specified size of the buffer?
> 
> I'm pretty sure that would seriously violate POSIX in the general
> case, and maybe even break some programs that do fancy buffer
> management (ie I could imagine some circular buffer thing that expects
> any "unwritten" ('unread'?) parts to stay with the old contents)
> 
> That said, that's for generic 'read()' cases for things like tty's or
> pipes etc that can return partial reads in the first place.
> 
> If it's a regular file, then any partial read *already* violates
> POSIX, and nobody sane would do any such buffer management because
> it's supposed to be a 'can't happen' thing.
> 
> And since you mention DIO, that's doubly true, and is already outside
> basic POSIX, and has already violated things like "all or nothing"
> rules for visibility of writes-vs-reads (which admittedly most Linux
> filesystems have violated even outside of DIO, since the strictest
> reading of the rules are incredibly nasty anyway). But filesystems
> like XFS which took some of the strict rules more seriously already
> ignored them for DIO, afaik.

I think for DIO, you're sacrificing the entire buffer with any filesystem.
If the underlying file is split across multiple drives, or is even
just fragmented on a single drive, we'll submit multiple BIOs which
will complete independently (even for SCSI which writes sequentially;
never mind NVMe which can DMA blocks asynchronously).  It might be
more apparent in a networking situation where errors are more common,
but it's always been a possibility since Linux introduced DIO.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ