Date: Mon, 15 May 2023 14:30:42 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Horatiu Vultur <horatiu.vultur@...rochip.com>
Cc: netdev@...r.kernel.org
Subject: Re: Performance regression on lan966x when extracting frames

On Mon, May 15, 2023 at 11:12 AM Horatiu Vultur
<horatiu.vultur@...rochip.com> wrote:
>
> Hi,
>
> I have noticed that on the HEAD of net-next[0] there is a performance drop
> for lan966x when extracting frames towards the CPU. Lan966x has a Cortex
> A7 CPU. All the tests were done using an iperf3 command like this:
> 'iperf3 -c 10.97.10.1 -R'
>
> So on net-next, I can see the following:
> [  5]   0.00-10.01  sec   473 MBytes   396 Mbits/sec  456 sender
> And it gets around ~97000 interrupts.
>
> While going back to the commit[1], I can see the following:
> [  5]   0.00-10.02  sec   632 MBytes   529 Mbits/sec   11 sender
> And it gets around ~1000 interrupts.
>
> I did a little searching and noticed that
> commit [2] introduced the regression.
> I have tried to revert this commit on net-next and tried again, then I
> can see much better results but not exactly the same:
> [  5]   0.00-10.01  sec   616 MBytes   516 Mbits/sec    0 sender
> And it gets around ~700 interrupts.
>
> So my question is: was I supposed to change something in the lan966x driver,
> or is there a bug in the lan966x driver that surfaced because of this change?
>
> Any advice will be great. Thanks!
>
> [0] befcc1fce564 ("sfc: fix use-after-free in efx_tc_flower_record_encap_match()")
> [1] d4671cb96fa3 ("Merge branch 'lan966x-tx-rx-improve'")
> [2] 8b43fd3d1d7d ("net: optimize ____napi_schedule() to avoid extra NET_RX_SOFTIRQ")
>
>

Hmmm... thanks for the report.

This seems related to softirq (k)scheduling.

Have you tried applying this recent commit?

Commit-ID:     d15121be7485655129101f3960ae6add40204463
Gitweb:        https://git.kernel.org/tip/d15121be7485655129101f3960ae6add40204463
Author:        Paolo Abeni <pabeni@...hat.com>
AuthorDate:    Mon, 08 May 2023 08:17:44 +02:00
Committer:     Thomas Gleixner <tglx@...utronix.de>
CommitterDate: Tue, 09 May 2023 21:50:27 +02:00

Revert "softirq: Let ksoftirqd do its job"


An alternative would be to try this:

diff --git a/net/core/dev.c b/net/core/dev.c
index b3c13e0419356b943e90b1f46dd7e035c6ec1a9c..f570a3ca00e7aa0e605178715f90bae17b86f071 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6713,8 +6713,8 @@ static __latent_entropy void net_rx_action(struct softirq_action *h)
        list_splice(&list, &sd->poll_list);
        if (!list_empty(&sd->poll_list))
                __raise_softirq_irqoff(NET_RX_SOFTIRQ);
-       else
-               sd->in_net_rx_action = false;
+
+       sd->in_net_rx_action = false;

        net_rps_action_and_irq_enable(sd);
 end:;
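
For illustration only, the mechanical effect of the change above can be sketched with a toy user-space model. This is not kernel code: the struct and the toy_* functions are hypothetical names, and the model assumes that, after commit [2], the schedule path skips raising NET_RX_SOFTIRQ while in_net_rx_action is set. It counts how many times the softirq would be raised when the poll list is still non-empty at the end of net_rx_action() and a device is scheduled again afterwards.

```c
#include <stdbool.h>

/* Toy stand-in for the per-CPU softnet_data; only what this sketch needs. */
struct toy_softnet {
	bool in_net_rx_action; /* mirrors sd->in_net_rx_action */
	int pending;           /* entries left on the (modelled) poll list */
	int softirq_raised;    /* times NET_RX_SOFTIRQ would have been raised */
};

/* Assumed model of the schedule path: no raise while the flag is set. */
static void toy_napi_schedule(struct toy_softnet *sd)
{
	sd->pending++;
	if (!sd->in_net_rx_action)
		sd->softirq_raised++;
}

/* Models the tail of net_rx_action(); always_clear selects the test patch. */
static void toy_rx_action_tail(struct toy_softnet *sd, bool always_clear)
{
	if (sd->pending > 0)
		sd->softirq_raised++;              /* re-raise: work is left */
	else
		sd->in_net_rx_action = false;      /* current code: clear only if idle */

	if (always_clear)
		sd->in_net_rx_action = false;      /* test patch: clear unconditionally */
}

/* Work left at the end of net_rx_action(), then one more schedule call. */
static int toy_scenario(bool always_clear)
{
	struct toy_softnet sd = {
		.in_net_rx_action = true, /* we are inside net_rx_action() */
		.pending = 1,             /* budget ran out with work remaining */
		.softirq_raised = 0,
	};

	toy_rx_action_tail(&sd, always_clear);
	toy_napi_schedule(&sd);
	return sd.softirq_raised;
}
```

In this model toy_scenario(false) counts one raise and toy_scenario(true) counts two, i.e. with the test patch the later schedule call raises the softirq again instead of relying on the raise done at the end of net_rx_action(). Whether that difference matters for lan966x is exactly what the test is meant to find out.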
