linux-kernel - Re: [PATCH net] net: stmmac: Fix NAPI poll in TX path when in multi-queue


Message-ID: <f8da6c15-991f-b774-a856-d3e8e45a33b6@gmail.com>
Date:   Thu, 14 Feb 2019 17:06:23 -0800
From:   Florian Fainelli <f.fainelli@...il.com>
To:     David Miller <davem@...emloft.net>, jose.abreu@...opsys.com
Cc:     netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        joao.pinto@...opsys.com, peppe.cavallaro@...com,
        alexandre.torgue@...com
Subject: Re: [PATCH net] net: stmmac: Fix NAPI poll in TX path when in
 multi-queue

On 2/14/19 9:01 AM, David Miller wrote:
> From: Jose Abreu <jose.abreu@...opsys.com>
> Date: Wed, 13 Feb 2019 18:00:43 +0100
> 
>> Commit 8fce33317023 introduced the concept of NAPI per-channel and
>> independent cleaning of TX path.
>>
>> This is currently breaking performance in some cases. The scenario
>> happens when all packets are being received in Queue 0 but the TX is
>> performed in Queue != 0.
>>
>> I didn't look very deeply, but it seems that NAPI for Queue 0 will clean
>> the RX path, but as TX is in a different NAPI, the latter is called at a
>> slower rate, which kills TX performance. I suspect this is because TX
>> cleaning takes much longer than RX and because NAPI gets canceled
>> once we return with 0 budget consumed (e.g. when TX is still not done it
>> will return 0 budget).
>>
>> Fix this by looking at all TX channels in NAPI poll function.
>>
>> Signed-off-by: Jose Abreu <joabreu@...opsys.com>
>> Fixes: 8fce33317023 ("net: stmmac: Rework coalesce timer and fix multi-queue races")
> 
> No this isn't right.
> 
> The TX interrupt events for Queue != 0 should clean up the TX packets
> on those queues.
> 
> Furthermore you are breaking the locality of the TX processing.
> 
> I'm not applying this, sorry.

Agreed. Why don't you create per-queue NAPI instances such that they are
all independent and can handle their TX completion/RX processing
entirely separately?
-- 
Florian
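
The per-queue NAPI suggestion above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not stmmac
code and not the kernel NAPI API: struct queue_ctx, queue_irq(), queue_poll()
and poll_until_idle() are hypothetical names invented for illustration. The
model follows the usual poll contract: a poll reports the work it did, never
more than its budget, and it completes and unmasks its interrupt only when it
did less work than the budget allowed. Each queue reclaims only its own TX
descriptors, so TX locality is kept and a busy RX queue cannot starve TX
cleaning on another queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-queue state, standing in for a driver's per-channel data. */
struct queue_ctx {
    int rx_pending;   /* received packets waiting to be processed */
    int tx_pending;   /* transmitted descriptors waiting to be reclaimed */
    bool irq_enabled; /* models this queue's interrupt being unmasked */
    bool scheduled;   /* models this queue's poll being scheduled */
};

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/* Models the interrupt for one queue: mask its IRQ and schedule its poll. */
static void queue_irq(struct queue_ctx *q)
{
    if (q->irq_enabled) {
        q->irq_enabled = false;
        q->scheduled = true;
    }
}

/*
 * Poll exactly one queue. TX reclaim and RX processing are both limited
 * to 'budget' and touch only this queue's counters. Returns the work done.
 */
static int queue_poll(struct queue_ctx *q, int budget)
{
    assert(budget > 0);

    int tx_done = min_int(q->tx_pending, budget);
    q->tx_pending -= tx_done;

    int rx_done = min_int(q->rx_pending, budget);
    q->rx_pending -= rx_done;

    int work = max_int(tx_done, rx_done);
    if (work < budget) {
        /* Less than a full budget: stop polling and unmask the IRQ. */
        q->scheduled = false;
        q->irq_enabled = true;
    }
    return work;
}

/* Keep polling one queue until it completes; returns the number of polls. */
static int poll_until_idle(struct queue_ctx *q, int budget)
{
    int polls = 0;
    while (q->scheduled) {
        queue_poll(q, budget);
        polls++;
    }
    return polls;
}
```

In this model, a queue that only receives and a different queue that only
transmits are each driven by their own interrupt and their own poll, so
neither depends on how often the other one runs.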