netdev - Re: [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing


Message-ID: <20200714095454.35705c36@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Tue, 14 Jul 2020 09:54:54 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Claudiu Manoil <claudiu.manoil@....com>
Cc:     "David S . Miller" <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 6/6] enetc: Add adaptive interrupt coalescing

On Tue, 14 Jul 2020 11:21:45 +0000 Claudiu Manoil wrote:
> >Does it really make sense to implement DIM for TX?
> >
> >For TX the only thing we care about is that no queue in the system
> >underflows. So the calculation is simply timeout = queue len / speed.
> >The only problem is which queue in the system is the smallest (TX
> >ring, TSQ etc.) but IMHO there's little point in the extra work to
> >calculate the thresholds dynamically. On real life workloads the
> >scheduler overhead the async work structs introduce cause measurable
> >regressions.
> >
> >That's just to share my experience, up to you to decide if you want
> >to keep the TX-side DIM or not :)  
> 
> Yeah, I'm not happy either with Tx DIM, it seems too much for this device,
> too much overhead.
> But it seemed there's no other option left, because leaving coalescing as
> disabled for Tx is not an option as there are too many Tx interrupts, but
> on the other hand coming up with a single Tx coalescing time threshold to
> cover all the possible cases is not feasible either.  However your suggestion
> to compute the Tx coalescing values based on link speed, at least that's how
> I read it, is worth investigating.  This device is supposed to handle link speeds
> ranging from 10Mbit to 2.5G, so it would be great if TX DIM could be replaced
> in this case by a set of precomputed values based on link speed.
> I'm going to look into this.  If you have any other suggestion on this pls let me know.

If you were happy with TX DIM - my guess would be that even if you
leave the TX coalescing with the value optimal for 2.5G - it will be
perfectly fine for other speeds, too. TX DIM is quite aggressive, if
I'm reading the code correctly it maxes out at 64us - which is a low
value for TX.

In my experiments with 25G NICs and TCP workloads (and some synthetic
netperf TCP_RR) the optimal value seems to be TSQ / link speed (- some
safety margin). Which is ~360us for 25G, since the TSQ value was bumped
to 1MB in recent kernels.
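
For illustration only, the "TSQ / link speed (- some safety margin)" rule
can be sketched as below. The function name, the margin parameter and the
reading of "1MB" as 1 MiB are assumptions made for this sketch, not kernel
code. It counts payload bytes only, with no header or framing overhead, so
it lands in the same range as the figure quoted above without matching it
exactly.

```python
def tx_coalesce_timeout_us(queue_bytes, link_bps, margin=0.0):
    """Time (in microseconds) to drain queue_bytes at link_bps.

    margin is a hypothetical fractional safety margin: 0.1 shortens the
    timeout by 10% so the queue does not run dry before the interrupt.
    """
    if link_bps <= 0:
        raise ValueError("link_bps must be positive")
    if not 0.0 <= margin < 1.0:
        raise ValueError("margin must be in [0, 1)")
    drain_us = queue_bytes * 8 / link_bps * 1e6
    return drain_us * (1.0 - margin)


# Assumption: the TSQ limit is taken as 1 MiB of payload.
TSQ_BYTES = 1 << 20

if __name__ == "__main__":
    for gbps in (2.5, 10, 25):
        t = tx_coalesce_timeout_us(TSQ_BYTES, gbps * 1e9)
        print(f"{gbps:>4} Gbit/s: {t:8.1f} us")
```

The same function works for any other queue in the path (TX ring, qdisc);
the smallest result is the one that bounds the timeout.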

Obviously YMMV if the system is running a routing or raw socket app.
Then you presumably want to sustain max throughput on 2.5G with min
sized frames. And your rings by default hold 256 entries - that's still
~50us to complete a ring.
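
A rough check of the ring figure, as a sketch: it assumes 64-byte
minimum-sized frames and ignores preamble and inter-frame gap unless asked
to include them. The helper name and the 20-byte per-frame wire overhead
option are assumptions for illustration.

```python
def ring_drain_time_us(entries, frame_bytes, link_bps, wire_overhead_bytes=0):
    """Time (in microseconds) to transmit a full ring of equal-sized frames.

    wire_overhead_bytes is an optional per-frame addition, e.g. 20 for
    Ethernet preamble plus inter-frame gap (an assumption, off by default).
    """
    if link_bps <= 0:
        raise ValueError("link_bps must be positive")
    bits = entries * (frame_bytes + wire_overhead_bytes) * 8
    return bits * 1e6 / link_bps


if __name__ == "__main__":
    # 256-entry ring, 64-byte frames, 2.5 Gbit/s link
    print(ring_drain_time_us(256, 64, 2.5e9))
    # Same ring, counting 20 bytes of per-frame wire overhead
    print(ring_drain_time_us(256, 64, 2.5e9, wire_overhead_bytes=20))
```

Without wire overhead this gives roughly 52us, in line with the ~50us
above; counting the overhead only makes the ring last longer.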