Message-ID: <CANn89i+QT3nfE-nN9b6eeyMBp93CVHZYteuH6N9ErKYqF8PA=A@mail.gmail.com>
Date: Tue, 16 May 2023 10:04:32 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Horatiu Vultur <horatiu.vultur@...rochip.com>
Cc: netdev@...r.kernel.org
Subject: Re: Performance regression on lan966x when extracting frames

On Tue, May 16, 2023 at 9:45 AM Horatiu Vultur
<horatiu.vultur@...rochip.com> wrote:
>
> The 05/15/2023 14:30, Eric Dumazet wrote:
> >
> > On Mon, May 15, 2023 at 11:12 AM Horatiu Vultur
> > <horatiu.vultur@...rochip.com> wrote:
>
> Hi Eric,
>
> Thanks for looking at this.
>
> > >
> > > Hi,
> > >
> > > I have noticed that on the HEAD of net-next[0] there is a performance drop
> > > for lan966x when extracting frames towards the CPU. Lan966x has a Cortex
> > > A7 CPU. All the tests are done using iperf3 command like this:
> > > 'iperf3 -c 10.97.10.1 -R'
> > >
> > > So on net-next, I can see the following:
> > > [ 5] 0.00-10.01 sec 473 MBytes 396 Mbits/sec 456 sender
> > > And it gets around ~97000 interrupts.
> > >
> > > While going back to the commit[1], I can see the following:
> > > [ 5] 0.00-10.02 sec 632 MBytes 529 Mbits/sec 11 sender
> > > And it gets around ~1000 interrupts.
> > >
> > > I have done a little bit of searching and I have noticed that this
> > > commit [2] introduce the regression.
> > > I have tried to revert this commit on net-next and tried again, then I
> > > can see much better results but not exactly the same:
> > > [ 5] 0.00-10.01 sec 616 MBytes 516 Mbits/sec 0 sender
> > > And it gets around ~700 interrupts.
> > >
> > > So my question is, was I supposed to change something in lan966x driver?
> > > or is there a bug in lan966x driver that pop up because of this change?
> > >
> > > Any advice will be great. Thanks!
> > >
> > > [0] befcc1fce564 ("sfc: fix use-after-free in efx_tc_flower_record_encap_match()")
> > > [1] d4671cb96fa3 ("Merge branch 'lan966x-tx-rx-improve'")
> > > [2] 8b43fd3d1d7d ("net: optimize ____napi_schedule() to avoid extra NET_RX_SOFTIRQ")
> > >
> > >
> >
> > Hmmm... thanks for the report.
> >
> > This seems related to softirq (k)scheduling.
> >
> > Have you tried to apply this recent commit ?
> >
> > Commit-ID: d15121be7485655129101f3960ae6add40204463
> > Gitweb: https://git.kernel.org/tip/d15121be7485655129101f3960ae6add40204463
> > Author: Paolo Abeni <pabeni@...hat.com>
> > AuthorDate: Mon, 08 May 2023 08:17:44 +02:00
> > Committer: Thomas Gleixner <tglx@...utronix.de>
> > CommitterDate: Tue, 09 May 2023 21:50:27 +02:00
> >
> > Revert "softirq: Let ksoftirqd do its job"
>
> I have tried to apply this patch but the results are the same:
> [ 5] 0.00-10.01 sec 478 MBytes 400 Mbits/sec 188 sender
> And it gets just a little bit bigger number of interrupts ~11000
>
> >
> >
> > Alternative would be to try this :
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index b3c13e0419356b943e90b1f46dd7e035c6ec1a9c..f570a3ca00e7aa0e605178715f90bae17b86f071
> > 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6713,8 +6713,8 @@ static __latent_entropy void
> > net_rx_action(struct softirq_action *h)
> >         list_splice(&list, &sd->poll_list);
> >         if (!list_empty(&sd->poll_list))
> >                 __raise_softirq_irqoff(NET_RX_SOFTIRQ);
> > -       else
> > -               sd->in_net_rx_action = false;
> > +
> > +       sd->in_net_rx_action = false;
> >
> >         net_rps_action_and_irq_enable(sd);
> >  end:;
>
> I have tried to use also this change with and without the previous patch
> but the result is the same:
> [ 5] 0.00-10.01 sec 478 MBytes 401 Mbits/sec 256 sender
> And it is the same number of interrupts.
>
> Is something else that I should try?

High number of interrupts for a saturated receiver seems wrong.
(Unless it is not saturating the cpu ?)

Perhaps hard irqs are not properly disabled by this driver.

You also could try using napi_schedule_prep(), just in case it helps.

diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
index bd72fbc2220f3010afd8b90f3704e261b9d0a98f..4694f4f34e6caf5cf540ada17a472c3c57f10823 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
@@ -628,10 +628,12 @@ irqreturn_t lan966x_fdma_irq_handler(int irq, void *args)
        err = lan_rd(lan966x, FDMA_INTR_ERR);

        if (db) {
-               lan_wr(0, lan966x, FDMA_INTR_DB_ENA);
-               lan_wr(db, lan966x, FDMA_INTR_DB);
+               if (napi_schedule_prep(&lan966x->napi)) {
+                       lan_wr(0, lan966x, FDMA_INTR_DB_ENA);
+                       lan_wr(db, lan966x, FDMA_INTR_DB);

-               napi_schedule(&lan966x->napi);
+                       __napi_schedule(&lan966x->napi);
+               }
        }

        if (err) {
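
For readers unfamiliar with the pattern in the patch above, here is a
minimal userspace sketch of the idea. It is not kernel code: every
model_* name is hypothetical, and the atomic flag only stands in for the
NAPI "scheduled" state bit that napi_schedule_prep() tests and sets. The
assumption being illustrated is that the device interrupt gets masked
only by the caller that actually wins the right to schedule the poll, so
a second interrupt arriving before the poll completes does not touch the
mask state or queue the poll again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of one NAPI instance; not a kernel type. */
struct model_napi {
	atomic_bool scheduled;	/* stands in for the "poll scheduled" state bit */
	bool irq_masked;	/* true while the device interrupt is masked */
	int polls_queued;	/* how many times the poll routine was queued */
};

static void model_init(struct model_napi *n)
{
	atomic_init(&n->scheduled, false);
	n->irq_masked = false;
	n->polls_queued = 0;
}

/* Returns true only for the caller that flips the flag from clear to set. */
static bool model_schedule_prep(struct model_napi *n)
{
	return !atomic_exchange(&n->scheduled, true);
}

/* Interrupt path: mask the interrupt and queue the poll only on success. */
static void model_irq_handler(struct model_napi *n)
{
	if (model_schedule_prep(n)) {
		n->irq_masked = true;	/* models writing 0 to the enable register */
		n->polls_queued++;	/* models queuing the poll routine */
	}
}

/* Poll path: work is done, clear the flag and unmask the interrupt. */
static void model_poll_complete(struct model_napi *n)
{
	assert(atomic_load(&n->scheduled));	/* only valid while scheduled */
	atomic_store(&n->scheduled, false);
	n->irq_masked = false;
}
```

In this model, two back-to-back calls to model_irq_handler() queue the
poll once; the second call is a no-op until model_poll_complete() runs.
Whether that matches what the lan966x hardware does is exactly the open
question in this thread.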