lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6fad62328ef919ada8036735cd0b835900d2f747.camel@redhat.com>
Date:   Fri, 31 Aug 2018 15:44:12 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     Sowmini Varadhan <sowmini.varadhan@...cle.com>,
        Network Development <netdev@...r.kernel.org>,
        Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH RFC net-next 00/11] udp gso

On Fri, 2018-08-31 at 09:08 -0400, Willem de Bruijn wrote:
> On Fri, Aug 31, 2018 at 5:09 AM Paolo Abeni <pabeni@...hat.com> wrote:
> > 
> > Hi,
> > 
> > On Tue, 2018-04-17 at 17:07 -0400, Willem de Bruijn wrote:
> > > That said, for negotiated flows an inverse GRO feature could
> > > conceivably be implemented to reduce rx stack traversal, too.
> > > Though due to interleaving of packets on the wire, it aggregation
> > > would be best effort, similar to TCP TSO and GRO using the
> > > PSH bit as packetization signal.
> > 
> > Reviving this old thread, before I forgot again. I have some local
> > patches implementing UDP GRO in a dual way to current GSO_UDP_L4
> > implementation: several datagram with the same length are aggregated
> > into a single one, and the user space receive a single larger packet
> > instead of multiple ones. I hope quic can leverage such scenario, but I
> > really know nothing about the protocol.
> > 
> > I measure roughly a 50% performance improvement with udpgso_bench in
> > respect to UDP GSO, and ~100% using a pktgen sender, and a reduced CPU
> > usage on the receiver[1]. Some additional hacking to the general GRO
> > bits is required to avoid useless socket lookups for ingress UDP
> > packets when UDP_GSO is not enabled.
> > 
> > If there is interest on this topic, I can share some RFC patches
> > (hopefully somewhat next week).
> 
> As Eric pointed out, QUIC reception on mobile clients over the WAN
> may not see much gain. But apparently there is a non-trivial amount
> of traffic the other way, to servers. Again, WAN might limit whatever
> gain we get, but I do want to look into that. And there are other UDP high
> throughput workloads (with or without QUIC) between servers.
> 
> If you have patches, please do share them. 

I'll try to clean them up and send them next week (as RFC).

> I actually also have a rough
> patch that I did not consider ready to share yet. Based on Tom's existing
> socket lookup in udp_gro_receive to detect whether a local destination
> exists and whether it has set an option to support receiving coalesced
> payloads (along with a cmsg to share the segment size).

That is more or less what I'm doing here.
Side note: I had test it in baremetal, as veth/lo do not trigger the
GRO path: selftest of this feature is not so straightforward.

> Converting udp_recvmsg to split apart gso packets to make this
> transparent seems to me to be too complex and not worth the effort.

Agreed. Moreover doing many, small, recvmsg() instead of a single,
large, one will hit the performances very badly due to PTI and
HARDENED_USERCOPY.

> If a local socket is not found in udp_gro_receive, this could also be
> tentative interpreted as a non-local path (with false positives), enabling
> transparent use of GRO + GSO batching on the forwarding path.

That sounds interesting, even if false positive looks dangerous to me.
Just to be on the same page, which false positive examples are you
thinking at? UDP sockets bound to local address behind NAT?

Cheers,

Paolo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ