netdev - Re: [PATCH net-next v3 01/18] net: Copy slab data for sendmsg(MSG_SPLICE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1952674.1687462843@warthog.procyon.org.uk>
Date: Thu, 22 Jun 2023 20:40:43 +0100
From: David Howells <dhowells@...hat.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: dhowells@...hat.com, netdev@...r.kernel.org,
    Alexander Duyck <alexander.duyck@...il.com>,
    "David S. Miller" <davem@...emloft.net>,
    Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
    Willem de Bruijn <willemdebruijn.kernel@...il.com>,
    David Ahern <dsahern@...nel.org>,
    Matthew Wilcox <willy@...radead.org>, Jens Axboe <axboe@...nel.dk>,
    linux-mm@...ck.org, linux-kernel@...r.kernel.org,
    Menglong Dong <imagedong@...cent.com>
Subject: Re: [PATCH net-next v3 01/18] net: Copy slab data for sendmsg(MSG_SPLICE_PAGES)

Jakub Kicinski <kuba@...nel.org> wrote:

> > If sendmsg() is passed MSG_SPLICE_PAGES and is given a buffer that contains
> > some data that's resident in the slab, copy it rather than returning EIO.
> 
> How did that happen? I thought MSG_SPLICE_PAGES comes from former
> sendpage users and sendpage can't operate on slab pages.

Some of my patches, take the siw one for example, now aggregate all the bits
that make up a message into a single sendmsg() call, including any protocol
header and trailer in the same bio_vec[] as the payload where before it would
have to do, say, sendmsg+sendpage+sendpage+...+sendpage+sendmsg.

I'm trying to make it so that I make the minimum number of sendmsg calls
(ie. 1 where possible) and the loop that processes the data is inside of that.
This offers the opportunity, at least in the future, to append slab data to an
already-existing private fragment in the skbuff.

> The locking is to local_bh_disable(). Does the milliont^w new frag
> allocator have any additional benefits?

It is shareable because it does locking.  Multiple sockets of multiple
protocols can share the pages it has reserved.  It drops the lock around calls
to the page allocator so that GFP_KERNEL/GFP_NOFS can be used with it.

Without this, the page fragment allocator would need to be per-socket, I
think, or be done further up the stack where the higher level drivers would
have to have a fragment bucket per whatever unit they use to deal with the
lack of locking.

Doing it here makes cleanup simpler since I just transfer my ref on the
fragment to the skbuff frag list and it will automatically be cleaned up with
the skbuff.

Willy suggested that I just allocate a page for each thing I want to copy, but
I would rather not do that for, say, an 8-byte bit of protocol data.

David