Message-Id: <fea8b16d-5a69-40f9-b123-e84dcd6e8f2e@www.fastmail.com>
Date:   Mon, 24 May 2021 11:42:52 +0100
From:   "Will Manley" <will@...liammanley.net>
To:     linux-fsdevel@...r.kernel.org
Cc:     "Dave Chinner" <david@...morbit.com>,
        "Kent Overstreet" <kent.overstreet@...il.com>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        "Jens Axboe" <axboe@...nel.dk>, linux-kernel@...r.kernel.org,
        "Alice Ryhl" <alice@...l.io>, br0adcast <br0adcast.007@...il.com>
Subject: BUG: preadv2(.., RWF_NOWAIT) returns spurious EOF

Hi All

We've seen preadv2(..., -1, RWF_NOWAIT) return 0 when the file position is at offset 4096 in a file much larger than 4096 bytes.  This breaks code that reads an entire file, because a return value of 0 signals end-of-file and the caller concludes that it has already read everything. We came across this while investigating a bug reported against the Rust async I/O library tokio. Its latest release takes advantage of RWF_NOWAIT for file I/O, and that has caused problems for users.

https://github.com/tokio-rs/tokio/issues/3803

The issue is readily reproducible. We've tested on armv7, i686 and x86_64 with the ext4 filesystem.  Here's the strace output:

preadv2(9, [{iov_base=..., iov_len=32}], 1, -1, RWF_NOWAIT) = 32
preadv2(9, [{iov_base=..., iov_len=32}], 1, -1, RWF_NOWAIT) = 32
preadv2(9, [{iov_base=..., iov_len=64}], 1, -1, RWF_NOWAIT) = 64
preadv2(9, [{iov_base=..., iov_len=128}], 1, -1, RWF_NOWAIT) = 128
preadv2(9, [{iov_base=..., iov_len=256}], 1, -1, RWF_NOWAIT) = 256
preadv2(9, [{iov_base=..., iov_len=512}], 1, -1, RWF_NOWAIT) = 512
preadv2(9, [{iov_base=..., iov_len=1024}], 1, -1, RWF_NOWAIT) = 1024
preadv2(9, [{iov_base=..., iov_len=2048}], 1, -1, RWF_NOWAIT) = 2048
preadv2(9, [{iov_base="", iov_len=4096}], 1, -1, RWF_NOWAIT) = 0
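
To make the failure mode concrete, here is a minimal Python sketch of a read-to-end loop. It is not tokio's actual code: the buffer growth policy is only modelled on the iov_len values in the trace above, and FakeFile (with its spurious_eof_at parameter) is a hypothetical stand-in for the file descriptor so that the zero-length read can be simulated on any kernel.

```python
from typing import Callable, Optional


class FakeFile:
    """Hypothetical in-memory file that can inject one kind of bogus read.

    If spurious_eof_at is set, a read issued at exactly that offset
    returns b"" even though more data follows (the behaviour reported
    in this message).
    """

    def __init__(self, content: bytes, spurious_eof_at: Optional[int] = None):
        self.content = content
        self.pos = 0
        self.spurious_eof_at = spurious_eof_at

    def read(self, size: int) -> bytes:
        if self.pos == self.spurious_eof_at:
            return b""  # simulated spurious EOF
        chunk = self.content[self.pos:self.pos + size]
        self.pos += len(chunk)
        return chunk


def read_to_end(read_chunk: Callable[[int], bytes],
                initial_capacity: int = 32) -> bytes:
    """Read until a zero-length result, doubling capacity when full.

    Each call asks for the spare capacity of the buffer, which gives
    the request sizes 32, 32, 64, 128, ... seen in the strace output.
    """
    data = bytearray()
    capacity = initial_capacity
    while True:
        if len(data) == capacity:
            capacity *= 2
        chunk = read_chunk(capacity - len(data))
        if not chunk:
            # A zero-length read is trusted as end-of-file.
            return bytes(data)
        data += chunk
```

With a 10000-byte FakeFile the loop returns all 10000 bytes, but with spurious_eof_at=4096 it stops after 4096 bytes and the caller has no way to tell the result is truncated.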

I'm not certain that the offset being 4096 is the cause.  Maybe the bug is triggered because the data would be written into an uncommitted page?

The bug is present in Linux 5.9 and 5.10, but was fixed in Linux 5.11.  I've run a bisect and it was introduced in 

    efa8480a831 fs: RWF_NOWAIT should imply IOCB_NOIO

and fixed in

    06c0444290 mm/filemap.c: generic_file_buffered_read() now uses find_get_pages_contig

This is already fixed but I thought it would be important to report it as the fix seems to be incidental.  The fix commit message doesn't mention anything about bugs so I wonder if the underlying issue still exists.

Our current plan is to add a uname check and to disable using the RWF_NOWAIT optimisation on 5.9 and 5.10.  Given that we don't understand the bug I thought it would be best to check with you. Maybe there's a better way of detecting the presence of this bug?
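
As a rough illustration of the uname check, here is a Python sketch. It assumes that the kernel release string starts with "major.minor" and that only 5.9 and 5.10 are affected, as found by the bisect above. The function names are made up for this example, and a vendor kernel that backports or reverts patches could make a pure version check wrong in either direction.

```python
import os
import re
from typing import Optional, Tuple

# Kernel series in which the spurious EOF was observed (per the bisect above).
AFFECTED_SERIES = {(5, 9), (5, 10)}


def parse_major_minor(release: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a release string such as '5.10.0-8-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def nowait_reads_look_safe(release: str) -> bool:
    """Return False for affected or unparsable releases, True otherwise."""
    version = parse_major_minor(release)
    if version is None:
        # Unknown format: be conservative and skip the optimisation.
        return False
    return version not in AFFECTED_SERIES


def current_kernel_looks_safe() -> bool:
    """Apply the check to the running kernel (os.uname() is Unix-only)."""
    return nowait_reads_look_safe(os.uname().release)
```

Treating an unparsable release string as unsafe is a design choice for this sketch: it only costs the fast path, whereas guessing wrong in the other direction risks truncated reads.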

There's more information at https://github.com/tokio-rs/tokio/issues/3803

Thanks

Will
