Date:   Thu, 14 Feb 2019 17:06:23 -0800
From:   Florian Fainelli <f.fainelli@...il.com>
To:     David Miller <davem@...emloft.net>, jose.abreu@...opsys.com
Cc:     netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        joao.pinto@...opsys.com, peppe.cavallaro@...com,
        alexandre.torgue@...com
Subject: Re: [PATCH net] net: stmmac: Fix NAPI poll in TX path when in
 multi-queue

On 2/14/19 9:01 AM, David Miller wrote:
> From: Jose Abreu <jose.abreu@...opsys.com>
> Date: Wed, 13 Feb 2019 18:00:43 +0100
> 
>> Commit 8fce33317023 introduced the concept of NAPI per-channel and
>> independent cleaning of TX path.
>>
>> This is currently breaking performance in some cases. The scenario
>> happens when all packets are being received in Queue 0 but the TX is
>> performed in Queue != 0.
>>
>> I didn't look very deeply, but it seems that NAPI for Queue 0 cleans
>> the RX path while TX sits in a different NAPI instance, which is called
>> at a slower rate, and that kills TX performance. I suspect this is
>> because TX cleaning takes much longer than RX and because NAPI gets
>> canceled once we return with 0 budget consumed (e.g. when TX is still
>> not done it will return 0 budget).
>>
>> Fix this by looking at all TX channels in NAPI poll function.
>>
>> Signed-off-by: Jose Abreu <joabreu@...opsys.com>
>> Fixes: 8fce33317023 ("net: stmmac: Rework coalesce timer and fix multi-queue races")
> 
> No this isn't right.
> 
> The TX interrupt events for Queue != 0 should clean up the TX packets
> on those queues.
> 
> Furthermore you are breaking the locality of the TX processing.
> 
> I'm not applying this, sorry.

Agreed, why don't you create per-queue NAPI instances such that they are
all independent and can complete their TX completion/RX processing
entirely separately?
-- 
Florian
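
For illustration only, here is a rough userspace model of the per-queue
idea suggested above. It is not stmmac code and does not use the kernel
NAPI API; struct chan, chan_irq() and chan_poll() are hypothetical names
made up for this sketch. The assumption is that each channel owns its
own poll context, so a channel's interrupt schedules only that context,
and TX reclaim for a queue never depends on another queue's RX load.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one DMA channel; not the stmmac structures. */
struct chan {
	int rx_pending;    /* packets waiting in the RX ring */
	int tx_pending;    /* sent descriptors waiting to be reclaimed */
	bool irq_enabled;  /* channel interrupt is unmasked */
	bool scheduled;    /* this channel's poll context is scheduled */
};

/*
 * Interrupt handler for one channel: mask its interrupt and schedule
 * only its own poll context (a real driver would call napi_schedule()
 * on that channel's napi_struct at this point).
 */
static void chan_irq(struct chan *ch)
{
	if (ch->irq_enabled) {
		ch->irq_enabled = false;
		ch->scheduled = true;
	}
}

/*
 * Poll one channel: reclaim its own TX descriptors, then receive up to
 * 'budget' packets. Returns the RX work done. If less than the full
 * budget was used, the poll completes and the interrupt is unmasked
 * again (roughly where napi_complete_done() would be called).
 */
static int chan_poll(struct chan *ch, int budget)
{
	int work = 0;

	assert(budget > 0);

	/* TX completion stays local to this channel. */
	ch->tx_pending = 0;

	while (work < budget && ch->rx_pending > 0) {
		ch->rx_pending--;
		work++;
	}

	if (work < budget) {
		ch->scheduled = false;
		ch->irq_enabled = true;
	}
	return work;
}
```

In this model a queue that only transmits finishes its poll with zero RX
work and simply re-arms its interrupt, while a busy RX queue stays
scheduled on its own without delaying the other queue's TX reclaim.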
