Message-ID: <1208134.1681421217@warthog.procyon.org.uk>
Date: Thu, 13 Apr 2023 22:26:57 +0100
From: David Howells <dhowells@...hat.com>
To: Al Viro <viro@...iv.linux.org.uk>,
Matthew Wilcox <willy@...radead.org>
Cc: dhowells@...hat.com, netdev@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
David Ahern <dsahern@...nel.org>,
Christoph Hellwig <hch@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Jens Axboe <axboe@...nel.dk>, Jeff Layton <jlayton@...nel.org>,
Christian Brauner <brauner@...nel.org>,
Chuck Lever III <chuck.lever@...cle.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: How to determine if a page can be spliced into an skbuff, or if it should be copied/rejected?

Al Viro <viro@...iv.linux.org.uk> wrote:
> On Tue, Apr 11, 2023 at 05:08:50PM +0100, David Howells wrote:
> > Add a function to handle MSG_SPLICE_PAGES being passed internally to
> > sendmsg(). Pages are spliced into the given socket buffer if possible and
> > copied in if not (ie. they're slab pages or have a zero refcount).
>
> That "ie." would better be "e.g." - that condition is *not* enough to
> tell the unsafe ones from the rest.
>
> sendpage_ok() would be better off called "might_be_ok_to_sendpage()".
> If it's false, we'd better not grab a reference to the page and expect the
> sucker to stay safe until the reference is dropped. However, AFAICS
> it might return true on a page that is not safe in that respect.
>
> What rules do you propose for sendpage users? "Pass whatever page reference
> you want, it'll do the right thing"? Anything short of that would better
> be documented as explicitly as possible...

Hmmm... Fair point. Is everything passed through splice guaranteed to be
safe, I wonder? Probably not, because of vmsplice(). Does that mean the
existing callers of sendpage_ok() are also making invalid assumptions?

So there are the following 'classes' of memory that I can immediately think
of:

- Zero page                             Splice (no ref?)
- Kernel core data                      Splice
- Module core data (vmalloc'd)          Splice
- Supervisor stack                      Copy
- Slab objects                          Copy
- Page frags                            Splice
- Other skbuff frags                    Splice
- Arbitrary pages (eg. sunrpc xdr buf)  Splice (probably)
- Ordinary pipe buffers                 Splice
- Spliced tmpfs                         Splice
- Spliced pagecache (file/block)        Splice
- Spliced DIO file/block                Splice
- Vmspliced mmap'd anon                 Splice (with pin?)
- Vmspliced MAP_SHARED pagecache        Splice (with pin?)
- Vmspliced MAP_SHARED DAX              Splice?
- Vmspliced MAP_SHARED MTD              Splice?
- Vmspliced MAP_SHARED other device     Reject? (e.g. graphics card mem)
- Vmspliced /dev/{mem,kmem}             Reject?

The question is how to tell whether we're looking at something that must be
copied or rejected. sendpage_ok() checks the PG_slab bit and the page
refcount, for example.

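
For illustration only, here is a minimal userspace sketch of the shape of
that check. struct model_page, enum model_origin and model_sendpage_ok() are
made-up stand-ins, not kernel code; the real helper works on struct page via
PageSlab() and page_count(). The origin field is an assumption added purely
to label where a page came from.

```c
#include <stdbool.h>

/* Hypothetical label for where a page came from (not a kernel type). */
enum model_origin {
	MODEL_PAGECACHE,	/* e.g. spliced file data */
	MODEL_SLAB,		/* e.g. a kmalloc'd object */
	MODEL_DEVICE_MMAP,	/* e.g. vmspliced device memory */
};

/* Hypothetical stand-in for struct page, reduced to what the check reads. */
struct model_page {
	bool slab;			/* models PageSlab(page) */
	int refcount;			/* models page_count(page) */
	enum model_origin origin;	/* never consulted by the check */
};

/*
 * Models the two conditions mentioned above: the page must not be a slab
 * page and must have a non-zero refcount.  Nothing else is examined.
 */
static bool model_sendpage_ok(const struct model_page *page)
{
	return !page->slab && page->refcount >= 1;
}
```

In this model a slab page or a page with a zero refcount is refused, but a
hypothetical MODEL_DEVICE_MMAP page that happens to be non-slab with a
refcount of 1 passes, because the origin never enters the decision. That is
the gap being pointed out: the check can say "ok" for a page from one of the
"Reject?" classes.
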
David