Message-ID: <642ad8b66acfe_302ae1208e7@willemb.c.googlers.com.notmuch>
Date:   Mon, 03 Apr 2023 09:46:30 -0400
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     David Howells <dhowells@...hat.com>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     dhowells@...hat.com, Matthew Wilcox <willy@...radead.org>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Christoph Hellwig <hch@...radead.org>,
        Jens Axboe <axboe@...nel.dk>, Jeff Layton <jlayton@...nel.org>,
        Christian Brauner <brauner@...nel.org>,
        Chuck Lever III <chuck.lever@...cle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        netdev@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 15/55] ip, udp: Support MSG_SPLICE_PAGES

David Howells wrote:
> Willem de Bruijn <willemdebruijn.kernel@...il.com> wrote:
> 
> > > +	} else if ((flags & MSG_SPLICE_PAGES) && length) {
> > > +		if (inet->hdrincl)
> > > +			return -EPERM;
> > > +		if (rt->dst.dev->features & NETIF_F_SG)
> > > +			/* We need an empty buffer to attach stuff to */
> > > +			initial_length = transhdrlen;
> > 
> > I still don't entirely understand what initial_length means.
> > 
> > More importantly, transhdrlen can be zero. If not called for UDP
> > but for RAW. Or if this is a subsequent call to a packet that is
> > being held with MSG_MORE.
> > 
> > This works fine for existing use-cases, which go to alloc_new_skb.
> > Not sure how this case would be different. But the comment alludes
> > that it does.
> 
> The problem is that in the non-MSG_ZEROCOPY case, __ip_append_data() assumes
> that it's going to copy the data it is given and will allocate sufficient
> space in the skb in advance to hold it - but I don't want to do that because I
> want to splice in the pages holding the data instead.  However, I do need to
> allocate space to hold the transport header.
> 
> Maybe I should change 'initial_length' to 'initial_alloc'?  It represents the
> amount I think we should allocate.  Or maybe I should have a separate
> allocation clause for MSG_SPLICE_PAGES?

The code already has to avoid allocation in the MSG_ZEROCOPY case. I
added alloclen and pagedlen for that purpose.

Only the transhdrlen will be copied with getfrag due to

    copy = datalen - transhdrlen - fraggap - pagedlen

On the next iteration of the loop, when the remaining data fits in the
skb, there are three cases. The first is skipped due to the
!NETIF_F_SG check. The other two are either copy to page frags or
zerocopy page frags.
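
As a rough illustration of that arithmetic, here is a simplified model
of the length bookkeeping. It is not the kernel code: the helper names
and the sample sizes are made up, and it assumes that in the paged case
pagedlen is set to datalen - transhdrlen.

```c
/*
 * Simplified model of the length bookkeeping discussed above.
 * Hypothetical helpers for illustration only, not kernel code.
 */

/* Assumption: in the paged case everything after the transport
 * header is accounted to page frags; otherwise nothing is. */
static int paged_len(int datalen, int transhdrlen, int paged)
{
	return paged ? datalen - transhdrlen : 0;
}

/* The formula quoted above: number of payload bytes that would be
 * copied into the linear area by getfrag. */
static int linear_copy_len(int datalen, int transhdrlen,
			   int fraggap, int pagedlen)
{
	return datalen - transhdrlen - fraggap - pagedlen;
}
```

With datalen = 1000, transhdrlen = 8 and fraggap = 0, the paged case
gives pagedlen = 992 and a copy length of 0, so no payload lands in the
linear area. The non-paged case gives a copy length of 992.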

I think your code should be able to fit in. Maybe easier if it could
reuse the existing alloc_new_skb code to copy the transport header, as
MSG_ZEROCOPY does, rather than adding a new __ip_splice_alloc branch
that short-circuits that. Then __ip_splice_pages also does not need
code to copy the initial header. But this is trickier. It's fine to
leave as is.

Since your code currently does call continue before executing the rest
of that branch, is there any need to modify the code there? Notably
the replacement of length with initial_length, which itself is
initialized to length in all cases except for MSG_SPLICE_PAGES.

Just hardcode transhdrlen as the copy argument to __ip_splice_pages.

> I also wonder if __ip_append_data() really needs two places that call
> getfrag().
> 
> David
> 

