Message-ID: <7f2d508c-1e21-9ee4-1cef-9b2dbb7e0a02@broadcom.com>
Date: Tue, 12 Apr 2022 15:21:54 -0700
From: Ray Jui <ray.jui@...adcom.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Michael Chan <michael.chan@...adcom.com>,
"David S. Miller" <davem@...emloft.net>,
Netdev <netdev@...r.kernel.org>
Subject: Re: [RFC] Applicability of using 'txq_trans_update' during ring
recovery
On 4/12/2022 2:49 PM, Jakub Kicinski wrote:
> On Tue, 12 Apr 2022 12:34:23 -0700 Ray Jui wrote:
>> Sure, that is the general error recovery case, which is very different
>> from the specific recovery case we are discussing here. This specific
>> recovery is performed solely by the driver (without resetting the
>> firmware or the chip) on a per NAPI ring set basis. While a specific
>> NAPI ring set is being recovered, traffic keeps flowing on the rest of
>> the NAPI ring sets. The average recovery time for this type of
>> recovery is in the 1 - 2 ms range.
>>
>> Also, as I already said, 'netif_carrier_off' is not an option given that
>> the RoCE/infiniband subsystem has a dependency on 'netif_carrier_status'
>> for many of its operations.
>>
>> Basically I'm looking for a solution that allows one to:
>> 1) quiesce traffic and perform recovery on a per NAPI ring set basis
>> 2) avoid any drastic effect on RoCE during the recovery
>>
>> 'txq_trans_update' may not be the most optimal solution, but it is a
>> solution that satisfies the two requirements above. If there is any
>> other option that is considered more optimal than 'txq_trans_update' and
>> can satisfy the two requirements, please let me know.
>
> The optimal solution would be to not have to reset your rings and
> pretend like nothing happened :/
Yes, I wish we had more robust HW so we wouldn't need to deal with this
in SW. Unfortunately, that is not my choice.
> If you can't reset the ring in time
> you'll have to live with the splat. End of story.
But the splat is not caused by the fact that we cannot recover in time;
instead, it is caused by the fact that there was no activity on the TX
queue for some time, and now when we stop that individual TX queue
(without shutting anything off at the netif level), a TX timeout can be
falsely triggered.
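
To make it concrete, here is a rough sketch of what I have in mind (not
the actual bnxt_en code; the helper name and the elided ring-reset step
are placeholders). Refreshing trans_start right before stopping the
queue is the part that 'txq_trans_update' would give us:

#include <linux/netdevice.h>

/* Rough sketch, not the actual driver code: quiesce and restart one
 * TX ring without touching the carrier state.  Refreshing trans_start
 * before stopping the queue keeps dev_watchdog() from seeing a queue
 * that is both stopped and idle for longer than watchdog_timeo, which
 * is what produces the false TX timeout splat.
 */
static void ring_quiesce_tx(struct net_device *dev, int ring_idx)
{
	struct netdev_queue *txq = netdev_get_tx_queue(dev, ring_idx);

	/* Pretend a transmission just completed on this queue so the
	 * watchdog's time_after(jiffies, trans_start + watchdog_timeo)
	 * check does not fire during the ~1-2 ms the queue is stopped.
	 */
	WRITE_ONCE(txq->trans_start, jiffies);

	netif_tx_stop_queue(txq);

	/* ... reset and re-fill the ring here ... */

	netif_tx_wake_queue(txq);
}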
Basically, it sounds like we currently do not have an optimal way in the
kernel to stop and re-start a TX queue on a per NAPI ring set basis
(whether we need it or not can be a separate discussion). Is this
statement correct?
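
For reference, my reading of the relevant dev_watchdog() test
(paraphrased from net/sched/sch_generic.c into a standalone helper; the
function name below is made up) is that the only per-queue state it
consults is whether the queue is stopped and how stale trans_start is,
so there is no per-ring way to tell it "this queue is under driver
recovery":

#include <linux/jiffies.h>
#include <linux/netdevice.h>

/* Hypothetical helper paraphrasing the per-queue check dev_watchdog()
 * makes: there is no "this queue is being recovered, skip it" state,
 * only the stopped bit plus the age of trans_start.
 */
static bool txq_looks_timed_out(struct net_device *dev, unsigned int i)
{
	struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
	unsigned long trans_start = READ_ONCE(txq->trans_start);

	return netif_xmit_stopped(txq) &&
	       time_after(jiffies, trans_start + dev->watchdog_timeo);
}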
Thanks!