Message-ID: <CAEA6p_Dx8KVjLnBOdrNTqDJBu+4z5bF51yc7KO9OzqjU0Hqy4Q@mail.gmail.com>
Date: Mon, 28 Sep 2020 11:15:24 -0700
From: Wei Wang <weiwan@...gle.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Felix Fietkau <nbd@....name>
Subject: Re: [RFC PATCH net-next 0/6] implement kthread based napi poll
On Mon, Sep 28, 2020 at 10:43 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Mon, Sep 14, 2020 at 7:26 PM Wei Wang <weiwan@...gle.com> wrote:
> >
> > The idea of moving the napi poll process out of softirq context into a
> > kernel thread based context is not new.
> > Paolo Abeni and Hannes Frederic Sowa proposed patches to move napi
> > poll to a kthread back in 2016, and Felix Fietkau proposed patches
> > along similar lines just a few weeks ago, using a workqueue to
> > process napi poll.
> >
> > The main reason we'd like to push forward with this idea is that the
> > scheduler has poor visibility into cpu cycles spent in softirq context,
> > and is not able to make optimal scheduling decisions for the user threads.
> > For example, in one of our application benchmarks where network load is
> > high, the CPUs handling network softirqs reach ~80% cpu utilization, yet
> > user threads are still scheduled on those CPUs even though more idle
> > cpus are available in the system, and we see very high tail latencies.
> > In this case, we have to explicitly pin user threads away from the CPUs
> > handling network softirqs to ensure good performance.
> > With napi poll moved to kthreads, the scheduler is in charge of scheduling
> > both the kthreads handling the network load and the user threads, and is
> > able to make better decisions. In the previous benchmark, if we do this
> > and pin the kthreads processing napi poll to specific CPUs, the scheduler
> > is able to schedule user threads away from these CPUs automatically.
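> >
> > To illustrate the idea, here is a rough, simplified sketch of what a
> > per-napi poll kthread could look like (the function name and details
> > below are illustrative only, not the exact code in this series):
> >
> > #include <linux/kthread.h>
> > #include <linux/netdevice.h>
> > #include <linux/sched.h>
> >
> > static int napi_threaded_poll(void *data)
> > {
> > 	struct napi_struct *napi = data;
> >
> > 	while (!kthread_should_stop()) {
> > 		/* Sleep until the driver's IRQ handler schedules this napi
> > 		 * instance and wakes the thread up.
> > 		 */
> > 		set_current_state(TASK_INTERRUPTIBLE);
> > 		if (!test_bit(NAPI_STATE_SCHED, &napi->state)) {
> > 			schedule();
> > 			continue;
> > 		}
> > 		__set_current_state(TASK_RUNNING);
> >
> > 		/* Poll in task context, so the cycles are accounted to this
> > 		 * kthread and are visible to the scheduler.
> > 		 */
> > 		local_bh_disable();
> > 		napi->poll(napi, napi->weight);
> > 		local_bh_enable();
> >
> > 		cond_resched();
> > 	}
> > 	return 0;
> > }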
> >
> > And the reason we prefer 1 kthread per napi instance, instead of 1
> > workqueue entity per host, is that a kthread is more configurable than a
> > workqueue: we can leverage existing tuning tools for threads, like taskset
> > and chrt, to tune its scheduling class, cpu set, and so on. Another reason
> > is that if we eventually want to provide a busy poll feature using kernel
> > threads for napi poll, a kthread seems more suitable than a workqueue.
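> >
> > As an illustration of the kind of tuning this enables, the userspace
> > snippet below pins a napi kthread to a cpu set and changes its
> > scheduling class, which is what taskset(1) and chrt(1) do under the
> > hood (the pid is made up for the example; in practice it would be
> > looked up by the kthread's name):
> >
> > #define _GNU_SOURCE
> > #include <sched.h>
> > #include <stdio.h>
> > #include <sys/types.h>
> >
> > int main(void)
> > {
> > 	pid_t napi_kthread = 12345;	/* hypothetical pid of a napi kthread */
> > 	cpu_set_t cpus;
> > 	struct sched_param sp = { .sched_priority = 50 };
> >
> > 	/* Equivalent of: taskset -pc 2,3 12345 */
> > 	CPU_ZERO(&cpus);
> > 	CPU_SET(2, &cpus);
> > 	CPU_SET(3, &cpus);
> > 	if (sched_setaffinity(napi_kthread, sizeof(cpus), &cpus))
> > 		perror("sched_setaffinity");
> >
> > 	/* Equivalent of: chrt -f -p 50 12345 */
> > 	if (sched_setscheduler(napi_kthread, SCHED_FIFO, &sp))
> > 		perror("sched_setscheduler");
> >
> > 	return 0;
> > }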
> >
> > In this patch series, I revived Paolo and Hannes's patches from 2016 and
> > kept them as the first 2 patches. On top of those are changes proposed by
> > Felix, Jakub, Paolo and myself, with suggestions from Eric Dumazet.
> >
> > In terms of performance, I ran tcp_rr tests with 1000 flows and various
> > request/response sizes, with RFS/RPS disabled, and compared performance
> > between the softirq and kthread models. The host has 56 hyperthreads and
> > a 100Gbps NIC.
> >
> >           req/resp   QPS      50%tile   90%tile   99%tile   99.9%tile
> > softirq   1B/1B      2.19M    284us     987us     1.1ms     1.56ms
> > kthread   1B/1B      2.14M    295us     987us     1.0ms     1.17ms
> >
> > softirq   5KB/5KB    1.31M    869us     1.06ms    1.28ms    2.38ms
> > kthread   5KB/5KB    1.32M    878us     1.06ms    1.26ms    1.66ms
> >
> > softirq   1MB/1MB    10.78K   84ms      166ms     234ms     294ms
> > kthread   1MB/1MB    10.83K   82ms      173ms     262ms     320ms
> >
> > I also ran one application benchmark where the user threads have more
> > work to do. We do see a good amount of tail latency reduction with the
> > kthread model.
>
>
>
> Wei, this is a very nice work.
>
> Please re-send it without the RFC tag, so that we can hopefully merge it ASAP.
>
> Thanks !
Thank you, Eric! I will prepare the official patch series and send it out soon.