Message-ID: <CACKFLin6HPBCY_L_nEzvTzfuObB1modqfB3p8OXk1cH=f7U4HQ@mail.gmail.com>
Date: Wed, 19 Jul 2023 23:59:06 -0700
From: Michael Chan <michael.chan@...adcom.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
pabeni@...hat.com
Subject: Re: [PATCH net-next v2 0/3] eth: bnxt: handle invalid Tx completions more gracefully
On Wed, Jul 19, 2023 at 6:05 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> bnxt trusts the events generated by the device which may lead to kernel
> crashes. These are extremely rare but they do happen. For a while
> I thought crashing may be intentional, because device reporting invalid
> completions should never happen, and having a core dump could be useful
> if it does. But in practice I haven't found any clues in the core dumps,
> and panic_on_warn exists.
>
> Series was tested by forcing the recovery path manually. Because of
> how rare the real crashes are I can't confirm it works for the actual
> device errors until it's been widely deployed.
>
> v2:
> - factor out the reset scheduling
> - also add a check on the XDP path
> v1: https://lore.kernel.org/all/20230710205611.1198878-1-kuba@kernel.org/
>
> Jakub Kicinski (3):
> eth: bnxt: move and rename reset helpers
> eth: bnxt: take the bit to set as argument of bnxt_queue_sp_work()
> eth: bnxt: handle invalid Tx completions more gracefully
>
For the series,
Reviewed-by: Michael Chan <michael.chan@...adcom.com>
I have already written the patches to preserve the ring error counters,
and they are being tested internally. I will add the Tx reset counter
when I post those patches. Thanks.