netdev - Re: [PATCH v5 2/6] ksm: support unsharing zero pages placed by KSM

Message-Id: <20230322072142.32751-1-xu.xin16@zte.com.cn>
Date:   Wed, 22 Mar 2023 07:21:42 +0000
From:   xu xin <xu.xin.sc@...il.com>
To:     linyunsheng@...wei.com, kuba@...nel.org
Cc:     davem@...emloft.net, edumazet@...gle.com, jiang.xuexin@....com.cn,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        xu.xin16@....com.cn, yang.yang29@....com.cn,
        zhang.yunkai@....com.cn
Subject: Re: [PATCH v5 2/6] ksm: support unsharing zero pages placed by KSM

On 2023/3/21 20:12, yang.yang29@....com.cn wrote:
>> From: xu xin <xu.xin16@....com.cn>
>> 
>> In the RPS procedure of NAPI receiving, regardless of whether the
>> rps-calculated CPU of the skb equals to the currently processing CPU, RPS
>> will always use enqueue_to_backlog to enqueue the skb to per-cpu backlog,
>> which will trigger a new NET_RX softirq.
>
>Does bypassing the backlog cause out of order problem for packet handling?
>It seems currently the RPS/RFS will ensure order delivery,such as:
>https://elixir.bootlin.com/linux/v6.3-rc3/source/net/core/dev.c#L4485
>
>Also, this is an optimization, it should target the net-next branch:
>[PATCH net-next] rps: process the skb directly if rps cpu not changed
>

Well, I thought the patch wouldn't break RFS's effort to avoid out-of-order
(OOO) packets. But thanks for the reminder. After rethinking it, bypassing the
backlog from "netif_receive_skb_list" will mislead RFS when it judges whether all
previous packets of the flow have been dequeued: RFS would conclude that all
packets have been dealt with, while they are actually still on the skb list.
Fortunately, bypassing the backlog from "netif_receive_skb" for a single skb is
okay and won't cause OOO packets, because every skb is processed serially by RPS
and sent to the protocol stack as soon as possible.
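
For illustration only, the ordering check referenced above (the
input_queue_head versus rflow->last_qtail comparison in get_rps_cpu) can be
modelled in user space roughly as below. This is a simplified sketch, not the
kernel code: struct backlog_model, struct flow_model and the three helper
functions are hypothetical names, and only the two counters involved in the
check are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU backlog counters. */
struct backlog_model {
	unsigned int input_queue_head;	/* packets dequeued so far */
	unsigned int input_queue_tail;	/* packets enqueued so far */
};

/* Hypothetical stand-in for the per-flow state used by RFS. */
struct flow_model {
	unsigned int last_qtail;	/* queue tail recorded at the flow's last enqueue */
};

/* Enqueue one packet of the flow and record the new tail for that flow. */
static void model_enqueue(struct backlog_model *q, struct flow_model *f)
{
	q->input_queue_tail++;
	f->last_qtail = q->input_queue_tail;
}

/* Dequeue one packet from the backlog. */
static void model_dequeue(struct backlog_model *q)
{
	assert(q->input_queue_head != q->input_queue_tail);
	q->input_queue_head++;
}

/*
 * True when everything the flow enqueued has already been dequeued, so the
 * flow could move to another CPU without reordering. The signed difference
 * keeps the comparison valid across counter wraparound.
 */
static bool flow_may_migrate(const struct backlog_model *q,
			     const struct flow_model *f)
{
	return (int)(q->input_queue_head - f->last_qtail) >= 0;
}
```

A packet that bypasses the backlog never goes through model_enqueue(), so
last_qtail is not advanced and flow_may_migrate() can report true while that
packet is still waiting on an skb list.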

If I'm correct, the following code can fix this.

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5666,8 +5666,9 @@ static int netif_receive_skb_internal(struct sk_buff *skb)
        if (static_branch_unlikely(&rps_needed)) {
                struct rps_dev_flow voidflow, *rflow = &voidflow;
                int cpu = get_rps_cpu(skb->dev, skb, &rflow);
+               int current_cpu = smp_processor_id();
 
-               if (cpu >= 0) {
+               if (cpu >= 0 && cpu != current_cpu) {
                        ret = enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
                        rcu_read_unlock();
                        return ret;
@@ -5699,11 +5700,15 @@ void netif_receive_skb_list_internal(struct list_head *head)
                list_for_each_entry_safe(skb, next, head, list) {
                        struct rps_dev_flow voidflow, *rflow = &voidflow;
                        int cpu = get_rps_cpu(skb->dev, skb, &rflow);
+                       int current_cpu = smp_processor_id();
 
                        if (cpu >= 0) {
                                /* Will be handled, remove from list */
                                skb_list_del_init(skb);
-                               enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
+                               if (cpu != current_cpu)
+                                       enqueue_to_backlog(skb, cpu, &rflow->last_qtail);
+                               else
+                                       __netif_receive_skb(skb);
                        }
                }
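
The decision made in both hunks above can be summarised by a small user-space
sketch. It is only a model of the branch logic under the assumption that a
negative value from get_rps_cpu means "no steering"; enum rx_action and
rx_decide() are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>

/* Hypothetical labels for the three outcomes of the patched receive path. */
enum rx_action {
	RX_NO_STEERING,		/* RPS returned no target CPU */
	RX_ENQUEUE_BACKLOG,	/* target is another CPU: enqueue_to_backlog() */
	RX_PROCESS_DIRECT,	/* target is this CPU: __netif_receive_skb() */
};

/* Model of the branch taken for one skb, given the RPS result. */
static enum rx_action rx_decide(int rps_cpu, int current_cpu)
{
	assert(current_cpu >= 0);

	if (rps_cpu < 0)
		return RX_NO_STEERING;
	if (rps_cpu != current_cpu)
		return RX_ENQUEUE_BACKLOG;
	return RX_PROCESS_DIRECT;
}
```

In the list variant, the skb is processed individually as soon as the
decision is RX_PROCESS_DIRECT, rather than being left on the list.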


Thanks.

>> 
>> Actually, it's not necessary to enqueue it to backlog when rps-calculated
>> CPU id equals to the current processing CPU, and we can call
>> __netif_receive_skb or __netif_receive_skb_list to process the skb directly.
>> The benefit is that it can reduce the number of softirqs of NET_RX and reduce
>> the processing delay of skb.
>> 
>> The measured result shows the patch brings 50% reduction of NET_RX softirqs.
>> The test was done on the QEMU environment with two-core CPU by iperf3.
>> taskset 01 iperf3 -c 192.168.2.250 -t 3 -u -R;
>> taskset 02 iperf3 -c 192.168.2.250 -t 3 -u -R;
>> 
>> Previous RPS:
>>                 CPU0       CPU1
>> NET_RX:         45          0    (before iperf3 testing)
>> NET_RX:        1095         241   (after iperf3 testing)
>> 
>> Patched RPS:
>>                 CPU0       CPU1
>> NET_RX:         28          4    (before iperf3 testing)
>> NET_RX:         573         32   (after iperf3 testing)
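
As a cross-check of the quoted figures, the NET_RX deltas work out to
1050 + 241 = 1291 softirqs for the previous RPS and 545 + 28 = 573 for the
patched RPS, i.e. roughly a 48% reduction on CPU0 and about 56% summed over
both CPUs. The arithmetic can be sketched as follows; struct softirq_sample,
softirq_delta() and reduction_percent() are hypothetical helper names used
only for this calculation.

```c
#include <assert.h>

/* NET_RX counter read before and after a test run (hypothetical type). */
struct softirq_sample {
	unsigned int before;
	unsigned int after;
};

/* Number of NET_RX softirqs raised during the run. */
static unsigned int softirq_delta(struct softirq_sample s)
{
	assert(s.after >= s.before);
	return s.after - s.before;
}

/* Percentage reduction going from the baseline count to the patched count. */
static double reduction_percent(unsigned int baseline, unsigned int patched)
{
	assert(baseline > 0);
	return 100.0 * (1.0 - (double)patched / (double)baseline);
}
```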
>
>Sincerely.
>Xu Xin