Message-ID: <CACKFLikGR5qa8+ReLi2krEH9En=5QRv0txEVcM2FE-W6Lc6UuA@mail.gmail.com>
Date: Tue, 11 Jul 2023 01:00:24 -0700
From: Michael Chan <michael.chan@...adcom.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
pabeni@...hat.com
Subject: Re: [PATCH net-next 3/3] eth: bnxt: handle invalid Tx completions more gracefully

On Mon, Jul 10, 2023 at 1:56 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> Invalid Tx completions should never happen (tm) but when they do
> they crash the host, because driver blindly trusts that there is
> a valid skb pointer on the ring.
>
> The completions I've seen appear to be some form of FW / HW
> miscalculation or staleness, they have typical (small) values
> (<100), but they are most often higher than number of queued
> descriptors. They usually happen after boot.
>
> Instead of crashing print a warning and schedule a reset.

It generally looks good to me. I have a few comments below.

The logic is very similar to the existing bnapi->in_reset logic that
resets the ring after RX errors. We already keep a counter of how many
times we do the RX reset, so I think it would be good to add a similar
TX reset counter.
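
To illustrate the counter idea, here is a rough userspace sketch, not
driver code. All names (tx_ring_sketch, tx_resets, tx_complete) are
hypothetical stand-ins rather than the actual bnxt structures, and it
assumes one descriptor per packet and free-running 16-bit indices for
simplicity:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a Tx ring; not the real bnxt layout. */
struct tx_ring_sketch {
	uint16_t prod;      /* free-running index of the next slot to fill */
	uint16_t cons;      /* free-running index of the next slot to complete */
	bool in_reset;      /* set once a reset has been requested */
	uint64_t tx_resets; /* hypothetical counter, analogous to the RX one */
};

/* Number of descriptors queued but not yet completed. */
static uint16_t tx_pending(const struct tx_ring_sketch *txr)
{
	return (uint16_t)(txr->prod - txr->cons);
}

/*
 * Process a completion that claims nr_pkts packets are done.
 * Returns true if it was consumed, false if it was ignored
 * (either bogus, or the ring is already waiting for a reset).
 */
static bool tx_complete(struct tx_ring_sketch *txr, uint16_t nr_pkts)
{
	if (txr->in_reset)
		return false;

	if (nr_pkts > tx_pending(txr)) {
		fprintf(stderr, "invalid Tx completion: %u > %u queued\n",
			(unsigned int)nr_pkts,
			(unsigned int)tx_pending(txr));
		txr->in_reset = true; /* a real driver would schedule the reset here */
		txr->tx_resets++;     /* count once per reset, not per bad completion */
		return false;
	}

	txr->cons = (uint16_t)(txr->cons + nr_pkts);
	return true;
}
```

Bumping the counter only on the transition into in_reset keeps it in
step with the number of resets actually requested, even if several bad
completions arrive before the reset runs.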

The XDP code path can potentially crash in a similar way if we get a
bad completion from hardware. I'm not sure if we should add similar
logic to the XDP code path.

Thanks.