netdev - Re: Bad performance in RX with sfc 40G

Message-ID: <20220102092207.rxz7kpjii4ermnfo@gmail.com>
Date:   Sun, 2 Jan 2022 09:22:07 +0000
From:   Martin Habets <habetsm.xilinx@...il.com>
To:     Íñigo Huguet <ihuguet@...hat.com>
Cc:     Edward Cree <ecree.xilinx@...il.com>, netdev@...r.kernel.org,
        Dinan Gunawardena <dinang@...inx.com>
Subject: Re: Bad performance in RX with sfc 40G

Hi Íñigo,

On Thu, Dec 23, 2021 at 02:18:03PM +0100, Íñigo Huguet wrote:
> Hi Martin,
> 
> I replied to this a few weeks ago, but it seems that, for some reason, I
> didn't CC you.

I'm just getting back to work after my holidays. Happy new year!

> On Thu, Dec 9, 2021 at 1:06 PM Íñigo Huguet <ihuguet@...hat.com> wrote:
> >
> > Hi,
> >
> > On Sat, Nov 20, 2021 at 9:31 AM Martin Habets <habetsm.xilinx@...il.com> wrote:
> > > If you're testing without the IOMMU enabled I suspect the recycle ring
> > > size may be too small. Can you try the patch below?
> >
> > Sorry for the very late reply, but I've had to be out of work for many days.
> >
> > This patch has improved the performance a lot, reaching the same
> > 30Gbps as in TX. However, it still seems a bit erratic, sometimes
> > dropping to 15Gbps, especially after module remove & probe,
> > or from one iperf call to another. But since it doesn't happen every
> > time, I haven't found a clear pattern. Anyway, it clearly improves things.

Thanks for the feedback. After module probe the RX cache is cold (empty),
as pages only get recycled as they come in.
The issue you see between iperf calls could be related to the NUMA
locality of the cache. After the 1st run the cache will contain pages for
the NUMA node that iperf ran on. If a subsequent run executes on a
different NUMA node the pages in the cache are further away.
This is where pinning the iperf runs to cores on the same NUMA node will
help.
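
As a rough illustration of the pinning suggested above, the sketch below
restricts a process to the CPUs of one NUMA node before launching the
benchmark. It assumes a Linux system, and it assumes that the node to pin to
is the one the NIC is attached to, as reported in sysfs; neither the helper
names nor that choice of node come from this thread.

```python
import os


def parse_cpulist(text):
    """Parse a sysfs cpulist such as '0-3,8-11' into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def nic_numa_cpus(iface):
    """Return the CPUs on the NUMA node of a PCI NIC (hypothetical helper)."""
    with open(f"/sys/class/net/{iface}/device/numa_node") as f:
        node = int(f.read())
    if node < 0:
        # -1 means the kernel reports no NUMA affinity: keep the current mask.
        return sorted(os.sched_getaffinity(0))
    with open(f"/sys/devices/system/node/node{node}/cpulist") as f:
        return parse_cpulist(f.read())


def run_pinned(iface, argv):
    """Pin this process to the NIC's NUMA node, then exec the command."""
    os.sched_setaffinity(0, nic_numa_cpus(iface))
    os.execvp(argv[0], argv)
```

For example, run_pinned("eth0", ["iperf3", "-s"]) would start the server with
a fixed CPU set on every run; the interface name and the iperf3 command line
are placeholders. This only controls where iperf runs, so the interrupt
affinity of the NIC queues may still need to be checked separately.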

> > Can this patch be applied as is, or is it just a test?

The patch is good for a 40G NIC. But it won't be good enough on a 100G NIC,
and for a 10G NIC the size can be smaller.
I've been puzzling over the best way to code this link speed dependency.
We create this page ring when the interface is brought up, which is
before the driver knows the link speed. So I think it is best to
size it for the maximum speed of a given NIC.
In short, I'll work on a better patch for this.
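
To make the idea of sizing for the maximum speed concrete, here is a minimal
sketch of one possible rule. It is not the eventual patch: the linear scaling
with link speed, the power-of-two rounding (which suits a ring indexed with a
mask) and the entries_per_10g parameter are all assumptions made for
illustration, and no real ring sizes are implied.

```python
def roundup_pow_of_two(n):
    """Smallest power of two that is >= n, for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


def recycle_ring_entries(max_speed_mbps, entries_per_10g):
    """Ring size for a NIC whose fastest supported link is max_speed_mbps.

    Assumption: the number of pages worth caching grows linearly with the
    link speed, so the ring is scaled in units of 10G and then rounded up.
    entries_per_10g is a hypothetical tuning value, not a driver constant.
    """
    units_of_10g = -(-max_speed_mbps // 10_000)  # ceiling division
    return roundup_pow_of_two(units_of_10g * entries_per_10g)
```

With such a rule the size depends only on what the hardware can support, so
it can be computed when the interface is brought up, before the negotiated
link speed is known.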

Martin