Message-ID: <CANn89iJb3k2kqZX9KQ-1tmw1L9Y0Lw4ksPRTeN97znS5Y3SJ4w@mail.gmail.com>
Date: Tue, 21 Mar 2023 20:03:24 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: yang.yang29@....com.cn
Cc: davem@...emloft.net, kuba@...nel.org, netdev@...r.kernel.org,
xu.xin16@....com.cn, jiang.xuexin@....com.cn,
zhang.yunkai@....com.cn
Subject: Re: [PATCH] rps: process the skb directly if rps cpu not changed
On Tue, Mar 21, 2023 at 5:12 AM <yang.yang29@....com.cn> wrote:
>
> From: xu xin <xu.xin16@....com.cn>
>
> In the RPS stage of NAPI receive processing, RPS always uses
> enqueue_to_backlog() to put the skb on a per-CPU backlog, regardless of
> whether the RPS-calculated CPU of the skb equals the CPU currently
> processing it; this triggers a new NET_RX softirq.
>
> Enqueueing to the backlog is unnecessary when the RPS-calculated CPU id
> equals the current CPU: we can call __netif_receive_skb() or
> __netif_receive_skb_list() to process the skb directly. The benefit is
> fewer NET_RX softirqs and lower per-skb processing delay.
>
> Measurements show the patch brings a 50% reduction in NET_RX softirqs.
> The test was done in a QEMU environment with a two-core CPU, using iperf3:
> taskset 01 iperf3 -c 192.168.2.250 -t 3 -u -R;
> taskset 02 iperf3 -c 192.168.2.250 -t 3 -u -R;
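
The quoted proposal boils down to one comparison at the RPS dispatch
point: if the RPS-computed CPU is the CPU we are already running on,
process the skb in place instead of queuing it to a backlog. Below is a
minimal, self-contained C sketch of that decision, assuming the check
sits right where the target CPU is computed; rps_target_cpu(),
process_skb_now() and backlog_enqueue() are hypothetical stand-ins for
the kernel's get_rps_cpu(), __netif_receive_skb() and
enqueue_to_backlog(), not kernel code.

/* Sketch of the decision the quoted patch proposes, modeled in plain C.
 * All names here are hypothetical stand-ins, not kernel APIs. */
#include <stdio.h>

struct skb { int flow_hash; };          /* stand-in for struct sk_buff */

static int current_cpu = 0;             /* stand-in for smp_processor_id() */

static int rps_target_cpu(const struct skb *skb, int nr_cpus)
{
	return skb->flow_hash % nr_cpus;    /* stand-in for get_rps_cpu() */
}

static void process_skb_now(const struct skb *skb)
{
	printf("flow %d: processed inline on cpu %d\n",
	       skb->flow_hash, current_cpu);
}

static void backlog_enqueue(const struct skb *skb, int cpu)
{
	printf("flow %d: queued to cpu %d backlog (NET_RX later)\n",
	       skb->flow_hash, cpu);
}

static void rps_dispatch(const struct skb *skb, int nr_cpus)
{
	int cpu = rps_target_cpu(skb, nr_cpus);

	if (cpu == current_cpu)
		process_skb_now(skb);       /* the patch's shortcut */
	else
		backlog_enqueue(skb, cpu);  /* unchanged remote-CPU path */
}

int main(void)
{
	struct skb a = { .flow_hash = 4 }, b = { .flow_hash = 7 };

	rps_dispatch(&a, 2);                /* 4 % 2 == 0 -> inline */
	rps_dispatch(&b, 2);                /* 7 % 2 == 1 -> backlog */
	return 0;
}
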
The current behavior is not an accident; it was a deliberate choice.
RPS was really meant for non-multi-queue devices.
The idea was to dequeue all packets and queue them on the various
per-CPU queues, then, at the end of napi->poll(), process 'our' packets.
This is how latencies were kept small (no head-of-line blocking).
Reducing the number of NET_RX softirqs probably does not change
performance.
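
Concretely, the design described here is two-phase: during napi->poll()
every packet is only steered onto a per-CPU queue, the local CPU
included, and the local queue is drained only after the whole batch has
been steered, so one expensive packet cannot hold up the steering of
the packets behind it. A minimal, self-contained C sketch of that
ordering follows; napi_poll_batch(), local_queue and the other names
are hypothetical stand-ins, not the kernel's implementation.

/* Toy model of the two-phase flow: during the poll loop every packet is
 * only *queued* (local CPU included); the local queue is drained after
 * the whole batch has been steered.  Names are hypothetical. */
#include <stdio.h>

#define BATCH 6

static int local_cpu = 0;
static int local_queue[BATCH];
static int local_count;

static void napi_poll_batch(const int target_cpu[], int n)
{
	/* Phase 1: steer everything; nothing is processed yet, so an
	 * expensive packet cannot delay steering of the ones behind it. */
	for (int i = 0; i < n; i++) {
		if (target_cpu[i] == local_cpu)
			local_queue[local_count++] = i;
		else
			printf("pkt %d: queued for remote cpu %d\n",
			       i, target_cpu[i]);
	}

	/* Phase 2: only now process 'our' packets, after the batch is
	 * fully steered (the end of napi->poll() in the real flow). */
	for (int i = 0; i < local_count; i++)
		printf("pkt %d: processed on local cpu %d\n",
		       local_queue[i], local_cpu);
}

int main(void)
{
	int target_cpu[BATCH] = { 0, 1, 0, 1, 1, 0 };

	napi_poll_batch(target_cpu, BATCH);
	return 0;
}
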