Message-ID: <20260108083131.6e090e86@kernel.org>
Date: Thu, 8 Jan 2026 08:31:31 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Ankit Garg <nktgrg@...gle.com>
Cc: Joshua Washington <joshwash@...gle.com>, netdev@...r.kernel.org,
Harshitha Ramamurthy <hramamurthy@...gle.com>, Andrew Lunn
<andrew+netdev@...n.ch>, "David S. Miller" <davem@...emloft.net>, Eric
Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Willem de
Bruijn <willemb@...gle.com>, Praveen Kaligineedi <pkaligineedi@...gle.com>,
Catherine Sullivan <csully@...gle.com>, Luigi Rizzo <lrizzo@...gle.com>,
Jon Olson <jonolson@...gle.com>, Sagi Shahar <sagis@...gle.com>, Bailey
Forrest <bcf@...gle.com>, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH net 0/2] gve: fix crashes on invalid TX queue indices
On Thu, 8 Jan 2026 07:35:59 -0800 Ankit Garg wrote:
> On Tue, Jan 6, 2026 at 6:22 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > On Mon, 5 Jan 2026 15:25:02 -0800 Joshua Washington wrote:
> > > This series fixes a kernel panic in the GVE driver caused by
> > > out-of-bounds array access when the network stack provides an invalid
> > > TX queue index.
> >
> > Do you know how? I seem to recall we had such issues due to bugs
> > in the qdisc layer, most of which were fixed.
> >
> > Fixing this at the source, if possible, would be far preferable
> > to sprinkling this condition to all the drivers.
>
> That matches our observation—we have encountered this panic on older
> kernels (specifically Rocky Linux 8) but have not been able to
> reproduce it on recent upstream kernels.
>
> Could you point us to the specific qdisc fixes you recall? We'd like
> to verify if the issue we are seeing on the older kernel is indeed one
> of those known/fixed bugs.
Very old - ac5b70198adc25
> If it turns out this is fully resolved in the core network stack
> upstream, we can drop this patch for the mainline driver. However, if
> there is ambiguity, do you think there is value in keeping this check
> to prevent the driver from crashing on invalid input?
The API contract is that the stack does not send frames for queues
which don't exist (index >= real_num_tx_queues) down to the drivers.
There's no ambiguity, IMO: if the stack sends such frames, it's a bug
in the stack.
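
As a rough illustration of that contract, here is a userspace sketch (not
kernel code). The struct and both function names are hypothetical stand-ins
made up for this example. It assumes queue indices are zero-based, so the
valid range is 0 .. real_num_tx_queues - 1. The range check lives on the
stack side, and the driver side indexes its rings without re-checking.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant fields of a net device. */
struct mock_netdev {
	uint16_t num_tx_queues;      /* queues allocated at setup time */
	uint16_t real_num_tx_queues; /* queues currently in use */
};

/*
 * Stack side (assumed behaviour for this sketch): never hand the driver
 * an index outside 0 .. real_num_tx_queues - 1. Out-of-range values are
 * reported and replaced with queue 0.
 */
static uint16_t mock_cap_txqueue(const struct mock_netdev *dev, uint16_t idx)
{
	if (idx >= dev->real_num_tx_queues) {
		fprintf(stderr, "queue %u out of range (%u active), using 0\n",
			(unsigned)idx, (unsigned)dev->real_num_tx_queues);
		return 0;
	}
	return idx;
}

/*
 * Driver side: trusts the index it is given and uses it directly.
 * Returns the number of frames queued on that ring so far.
 */
static int mock_driver_xmit(int *rings, uint16_t idx)
{
	return ++rings[idx];
}
```

With real_num_tx_queues set to 4, an index of 3 passes through unchanged,
while 4 or 9 is mapped to 0 before mock_driver_xmit() ever sees it. If an
out-of-range index did reach the driver, the place to fix that in this
model would be the selection path, not the ring lookup.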