Message-ID: <1021352.1746193306@warthog.procyon.org.uk>
Date: Fri, 02 May 2025 14:41:46 +0100
From: David Howells <dhowells@...hat.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: dhowells@...hat.com, David Hildenbrand <david@...hat.com>,
John Hubbard <jhubbard@...dia.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, willy@...radead.org,
netdev@...r.kernel.org, linux-mm@...ck.org
Subject: MSG_ZEROCOPY and the O_DIRECT vs fork() race
Andrew Lunn <andrew@...n.ch> wrote:
> > I'm looking into making the sendmsg() code properly handle the 'DIO vs
> > fork' issue (where pages need pinning rather than refs taken) and also
> > getting rid of the taking of refs entirely as the page refcount is going
> > to go away in the relatively near future.
>
> Sorry, new to this conversation, and i don't know what you mean by DIO
> vs fork.
As I understand it, there's a race between O_DIRECT I/O and fork(): if you,
say, start a DIO read into a page and then fork, the target page gets attached
to the child and a copy is made for the parent (because the refcount is
elevated by the I/O) - and so only the child sees the result. This is made
more interesting by things such as AIO, where the parent gets the completion
notification but not the data.
Further, the data for a DIO write can then be altered by the child if the DMA
has not yet happened.
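To illustrate the mechanism only (this is a minimal sketch, not a reproducer
for the race): after fork(), a private page is shared copy-on-write, so the
first write to it leaves parent and child looking at different physical
pages. That is why I/O aimed at the original physical page can end up visible
to only one of the two processes. The helper name below is hypothetical, and
the sketch assumes Linux with glibc.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill a private anonymous page with 'A', fork, let the
 * parent overwrite the first byte with 'B', then ask the child what it sees.
 * Returns the byte observed by the child, or -1 on error.
 */
int child_view_after_parent_write(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char *buf, seen = 0;
	int go[2], result[2], ret = -1;
	pid_t pid;

	if (page_size <= 0)
		return -1;
	buf = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	memset(buf, 'A', (size_t)page_size);

	if (pipe(go) < 0)
		goto out_unmap;
	if (pipe(result) < 0) {
		close(go[0]);
		close(go[1]);
		goto out_unmap;
	}

	pid = fork();
	if (pid == 0) {
		/* Child: wait for the parent's write, then report our view. */
		char token;

		close(go[1]);
		close(result[0]);
		if (read(go[0], &token, 1) != 1)
			_exit(1);
		seen = buf[0];
		if (write(result[1], &seen, 1) != 1)
			_exit(1);
		_exit(0);
	}

	close(go[0]);
	close(result[1]);
	if (pid > 0) {
		/* Parent: this write breaks the COW sharing of the page. */
		buf[0] = 'B';
		if (write(go[1], "x", 1) == 1 &&
		    read(result[0], &seen, 1) == 1)
			ret = seen;
		waitpid(pid, NULL, 0);
	}
	close(go[1]);
	close(result[0]);
out_unmap:
	munmap(buf, (size_t)page_size);
	return ret;
}
```

The child still reads 'A' even though the parent wrote 'B' first. Which of
the two processes keeps the original physical page, and which gets the copy,
is the kernel's decision, and that decision is what goes wrong when an I/O
reference is outstanding on the page.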
One of the things mm/gup.c does is to work around this issue... However, I
don't think that MSG_ZEROCOPY handles this - and so zerocopy sendmsg is, I
think, subject to the same race.
> Could you point me at a discussion.
I don't know of one, offhand, apart from in the logs for mm/gup.c. I've added
a couple more mm guys and the mm list to the cc: field.
The information in the description of fc1d8e7cca2daa18d2fe56b94874848adf89d7f5
may be relevant.
David