Message-ID: <CACDg6nWEAKWU3s1x+NRU28BcXHK0=yFkAAU4MMkSTgEA9g592w@mail.gmail.com>
Date: Tue, 24 Jun 2025 14:31:18 -0400
From: Andy Gospodarek <andrew.gospodarek@...adcom.com>
To: Michael Chan <michael.chan@...adcom.com>
Cc: Jesper Dangaard Brouer <hawk@...nel.org>, Yan Zhai <yan@...udflare.com>, netdev@...r.kernel.org,
Pavan Chebbi <pavan.chebbi@...adcom.com>, Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, John Fastabend <john.fastabend@...il.com>,
Stanislav Fomichev <sdf@...ichev.me>, linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
kernel-team@...udflare.com
Subject: Re: [PATCH net] bnxt: properly flush XDP redirect lists
On Tue, Jun 24, 2025 at 2:00 PM Michael Chan <michael.chan@...adcom.com> wrote:
>
> On Mon, Jun 23, 2025 at 10:59 PM Jesper Dangaard Brouer <hawk@...nel.org> wrote:
> >
> > On 23/06/2025 18.06, Yan Zhai wrote:
> > > > We encountered the following crash when testing an XDP_REDIRECT feature
> > > > in production:
> > >
> > [...]
> > >
> > (To Andy + Michael:)
> > The initial bug was introduced in [1] commit a7559bc8c17c ("bnxt:
> > support transmit and free of aggregation buffers") in bnxt_rx_xdp()
> > > where case XDP_TX zeroes *event, which also carries the XDP-redirect
> > > indication.
> > I'm wondering if the driver should not reset the *event value?
> > > (no other driver code path does)
>
> Resetting *event was only correct before XDP_REDIRECT support was added.
>
> >
> >
> > > > We can reliably reproduce this crash by returning XDP_TX
> > > > and XDP_REDIRECT randomly for incoming packets in a naive XDP program.
> > > > Properly propagating the XDP_REDIRECT events back fixes the crash.
>
> Thanks for the patch. The fix is similar to edc0140cc3b7 ("bnxt_en:
> Flush XDP for bnxt_poll_nitroa0()'s NAPI")
>
> Somehow the fix was only applied to one chip's poll function and not
> the other chips' poll functions.
Odd that we missed this back then. Thanks for the fix for all other devices.
> Reviewed-by: Michael Chan <michael.chan@...adcom.com>
Reviewed-by: Andy Gospodarek <gospo@...adcom.com>
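
For readers following the thread, here is a minimal sketch of the failure
mode discussed above: one event bitmask is shared across all packets in a
poll cycle, so zeroing it while handling an XDP_TX packet also drops a
redirect bit set by an earlier packet, and the poll loop then never flushes
the redirect lists. The DEMO_* flag names, their values, and the helper
functions are hypothetical and chosen only for illustration; this is not the
driver's code and not the patch itself.

```c
#include <assert.h>

/* Hypothetical event bits, for illustration only. */
#define DEMO_RX_EVENT       0x1u
#define DEMO_TX_EVENT       0x4u
#define DEMO_REDIRECT_EVENT 0x8u

/* Problematic pattern: reset the whole mask, then record the TX event.
 * Any bit set by an earlier packet in the same poll cycle is lost. */
void demo_xdp_tx_reset(unsigned int *event)
{
	*event = 0;
	*event |= DEMO_TX_EVENT;
}

/* Preserving pattern: clear only the bit this packet no longer needs,
 * leaving bits from earlier packets (such as a pending redirect) intact. */
void demo_xdp_tx_preserve(unsigned int *event)
{
	*event &= ~DEMO_RX_EVENT;
	*event |= DEMO_TX_EVENT;
}

/* What a poll loop would check before flushing redirect lists. */
int demo_needs_redirect_flush(unsigned int event)
{
	return (event & DEMO_REDIRECT_EVENT) != 0;
}

/* Self-check: one packet was redirected, then the next one takes XDP_TX. */
void demo_selfcheck(void)
{
	unsigned int lost = DEMO_RX_EVENT | DEMO_REDIRECT_EVENT;
	unsigned int kept = lost;

	demo_xdp_tx_reset(&lost);
	demo_xdp_tx_preserve(&kept);

	assert(!demo_needs_redirect_flush(lost)); /* flush would be skipped */
	assert(demo_needs_redirect_flush(kept));  /* flush still happens */
}
```

Mixing XDP_TX and XDP_REDIRECT verdicts within one poll cycle, as in the
reproducer described in the quoted report, is the sequence that exercises
the difference between the two helpers.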