Message-ID: <304d5286b6364da48a2bb1125155b7e5@AcuMS.aculab.com>
Date:   Fri, 10 Feb 2023 22:41:46 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Linus Torvalds' <torvalds@...ux-foundation.org>,
        Dave Chinner <david@...morbit.com>
CC:     Stefan Metzmacher <metze@...ba.org>, Jens Axboe <axboe@...nel.dk>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux API Mailing List <linux-api@...r.kernel.org>,
        io-uring <io-uring@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Samba Technical <samba-technical@...ts.samba.org>
Subject: RE: copy on write for splice() from file to pipe?

From: Linus Torvalds
> Sent: 10 February 2023 17:24
...
> And when it comes to networking, in general things like TCP checksums
> etc should be ok even with data that isn't stable.  When doing things
> by hand, networking should always use the "copy-and-checksum"
> functions that do the checksum while copying (so even if the source
> data changes, the checksum is going to be the checksum for the data
> that was copied).
> 
> And in many (most?) smarter network cards, the card itself does the
> checksum, again on the data as it is transferred from memory.
> 
> So it's not like "networking needs a stable source" is some really
> _fundamental_ requirement for things like that to work.

It is also worth remembering that TCP needs to be able
to retransmit the data at a much later time.
So the application must not change the data until it has
been acked by the remote system.
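
As a rough illustration of the retransmission point, here is a toy
model in Python. It is not real networking code: the ToySender class
and its send()/retransmit() methods are invented for this sketch, and
'zero copy' is modelled simply as keeping a reference to the caller's
buffer instead of a private copy.

```python
class ToySender:
    """Toy model of a TCP send queue (hypothetical, for illustration only)."""

    def __init__(self, copy_on_send):
        self.copy_on_send = copy_on_send
        self.unacked = []  # segments kept until the peer acks them

    def send(self, buf):
        # Either snapshot the bytes now, or only keep a reference to the
        # caller's buffer (the 'send directly from user memory' case).
        segment = bytes(buf) if self.copy_on_send else memoryview(buf)
        self.unacked.append(segment)
        return bytes(segment)  # what goes on the wire the first time

    def retransmit(self):
        # Resend whatever the queued segments contain *now*.
        return [bytes(seg) for seg in self.unacked]


# Case 1: sender references the application buffer.
app_buf = bytearray(b"hello world")
zero_copy = ToySender(copy_on_send=False)
first = zero_copy.send(app_buf)
app_buf[0:5] = b"HELLO"  # application reuses the buffer before the ACK
resent = zero_copy.retransmit()[0]

# Case 2: sender took its own copy at send() time.
app_buf2 = bytearray(b"hello world")
copying = ToySender(copy_on_send=True)
first2 = copying.send(app_buf2)
app_buf2[0:5] = b"HELLO"
resent2 = copying.retransmit()[0]

print(first, resent)    # original transmission vs. retransmission, case 1
print(first2, resent2)  # original transmission vs. retransmission, case 2
```

In the first case the retransmission carries different bytes from the
original transmission; in the second case both are identical because
the sender owns its copy.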

Operating systems that do asynchronous IO directly from
application buffers have callbacks/events to tell the
application when it is allowed to modify the buffers.
For TCP this won't be indicated until after the ACK
is received.
I don't think io_uring has any way to indicate anything
other than 'the data has been accepted by the socket'.

If you have 'kernel pages containing data' (e.g. from writes
into a pipe, or data received from a network) then they have
a single 'owner' and can be passed about.
But user pages (including mmapped files) have multiple owners
so you are never going to be able to pass them as 'immutable
data'.
If you mmap a very large (and maybe sparse) file and then
try to do a very large (multi-GB) send() (with or without
any kind of page loaning) there is always the possibility
that the data that is actually sent was written while the
send() call was in progress.
Any kind of asynchronous send() just makes it more obvious.
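
The 'multiple owners' point can be seen from userspace with a small
sketch. It assumes a Unix-like system where Python's mmap.mmap()
defaults to a shared, writable mapping; the names sender_view and
writer_view are just labels for this example. Two mappings of the
same file page stand in for the sender and for some other writer.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE

with tempfile.TemporaryFile() as f:
    f.write(b"A" * PAGE)
    f.flush()

    # Two independent shared mappings of the same file page.
    sender_view = mmap.mmap(f.fileno(), PAGE)
    writer_view = mmap.mmap(f.fileno(), PAGE)

    before = sender_view[:4]   # what a send() starting now would read
    writer_view[:4] = b"BBBB"  # another owner writes to the page
    after = sender_view[:4]    # what a later read of the page sees

    sender_view.close()
    writer_view.close()

print(before, after)
```

Nothing in the sender's mapping stops the second mapping from changing
the contents, so a long-running send() from such pages can pick up
data written after the call started.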

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)