Message-ID: <1505990276.29839.120.camel@edumazet-glaptop3.roam.corp.google.com>
Date: Thu, 21 Sep 2017 03:37:56 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
pablo@...filter.org, fw@...len.de, edumazet@...gle.com,
hannes@...essinduktion.org
Subject: Re: [PATCH net-next 0/5] net: introduce noref sk
On Thu, 2017-09-21 at 11:42 +0200, Paolo Abeni wrote:
> Hi,
>
> Thanks for the feedback!
>
> On Wed, 2017-09-20 at 20:20 -0700, David Miller wrote:
> > From: Paolo Abeni <pabeni@...hat.com>
> > Date: Wed, 20 Sep 2017 18:54:00 +0200
> >
> > > This series introduces the infrastructure to store a socket
> > > pointer inside the skb without taking a refcount on the socket.
> > >
> > > Such infrastructure is then used in the network receive path,
> > > specifically in the early demux operation.
> > >
> > > This allows the UDP early demux to perform a full lookup for UDP sockets,
> > > with many benefits:
> > >
> > > - the UDP early demux code is now much simpler
> > > - the early demux does not hit any performance penalty in case of a UDP
> > > hash table collision - previously the early demux performed a partial,
> > > unsuccessful lookup
> > > - early demux is now also operational for unconnected sockets.
> > >
> > > This infrastructure will be used in a follow-up series to allow dst
> > > caching for unconnected UDP sockets, and then to extend the same feature
> > > to TCP listening sockets.
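> > >
> > > As a rough illustration of the mechanism (the helper names below are
> > > hypothetical, for explanation only - they are not lifted from these
> > > patches): the receive path stashes the demuxed socket in the skb and
> > > records that no reference was taken, so whoever consumes the skb knows
> > > whether a sock_put() is due:
> > >
> > > 	#include <linux/skbuff.h>
> > >
> > > 	/* Sketch only: mark skb->sk as not holding a reference;
> > > 	 * valid only inside the RCU-protected receive path that
> > > 	 * performed the early demux lookup. */
> > > 	static inline void skb_set_noref_sk(struct sk_buff *skb,
> > > 					    struct sock *sk)
> > > 	{
> > > 		skb->sk = sk;
> > > 		skb->destructor = NULL;	/* freeing must not sock_put() */
> > > 	}
> > >
> > > 	/* Sketch only: take over skb->sk, reporting whether a
> > > 	 * reference is held, so callers can sock_hold() before the
> > > 	 * skb leaves the RCU read side. */
> > > 	static inline struct sock *skb_steal_noref_sk(struct sk_buff *skb,
> > > 						      bool *refcounted)
> > > 	{
> > > 		struct sock *sk = skb->sk;
> > >
> > > 		*refcounted = sk && skb->destructor;
> > > 		skb->sk = NULL;
> > > 		skb->destructor = NULL;
> > > 		return sk;
> > > 	}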
> >
> > Like Eric, I find this series (while exciting) quite scary :-)
> >
> > You really have to post some kind of performance numbers in your
> > header posting in order to justify something with these ramifications
> > and scale.
>
> This is actually preparatory work for the next series, which will
> bring in the real gain. The next patches still need polishing, so we
> posted this part separately to get some early feedback.
>
> If that would help, I can post the follow-up soon as an RFC. Overall -
> with the follow-up applied, too - when using a single ingress RX
> queue, I measured a ~20% throughput gain for unconnected IPv4 sockets -
> with rp_filter disabled - and ~30% for IPv6 sockets. In case of
> multiple ingress queues, the gain is smaller but still measurable
> (roughly 5%).
>
> Please let me know if you prefer to see the full work early.
I want to see the full work, yes. IPv6 and everything.
I do not want ~1000 lines of changed code in the stack for some corner
cases where people do not properly use the existing infrastructure,
like SO_REUSEPORT with a proper BPF filter to build as many clean silos
as needed (with proper CPU/NUMA affinities to avoid QPI traffic).
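The kind of setup I mean looks roughly like the following - an
illustrative, untested sketch of the well-known per-CPU cBPF steering
trick, not anything from the series under discussion:

	/* One UDP socket per CPU, all bound to the same port via
	 * SO_REUSEPORT, with a classic BPF program steering each packet
	 * to the socket whose index matches the receiving CPU. */
	#include <linux/filter.h>
	#include <netinet/in.h>
	#include <stdint.h>
	#include <sys/socket.h>

	static int bind_reuseport_cpu_socket(uint16_t port)
	{
		/* Return the current CPU id; the reuseport layer picks
		 * the socket at that index (modulo the group size). */
		struct sock_filter code[] = {
			{ BPF_LD | BPF_W | BPF_ABS, 0, 0,
			  SKF_AD_OFF + SKF_AD_CPU },
			{ BPF_RET | BPF_A, 0, 0, 0 },
		};
		struct sock_fprog prog = { .len = 2, .filter = code };
		struct sockaddr_in addr = {
			.sin_family = AF_INET,
			.sin_port = htons(port),
			.sin_addr = { .s_addr = htonl(INADDR_ANY) },
		};
		int one = 1;
		int fd = socket(AF_INET, SOCK_DGRAM, 0);

		if (fd < 0)
			return -1;
		setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
		/* Attaching to any socket installs the program for the
		 * whole reuseport group. */
		setsockopt(fd, SOL_SOCKET, SO_ATTACH_REUSEPORT_CBPF,
			   &prog, sizeof(prog));
		if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
			return -1;
		return fd;
	}

With one such socket per CPU, each serviced by a thread pinned to that
CPU and each NIC RX queue IRQ pinned likewise, packets stay on one
CPU/NUMA node from IRQ to recvmsg().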
The complexity of your patches has reached a point where I am extremely
nervous.
Thanks.