Message-ID: <CAODvEq6E-TLJ5Z0L3dfB3NgQrCRuQ9W=-g97hw+1yM+yJB_7iw@mail.gmail.com>
Date: Thu, 15 Jan 2026 12:07:12 -0800
From: Li Li <boolli@...gle.com>
To: Paul Menzel <pmenzel@...gen.mpg.de>
Cc: Tony Nguyen <anthony.l.nguyen@...el.com>,
Przemek Kitszel <przemyslaw.kitszel@...el.com>, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>, intel-wired-lan@...ts.osuosl.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
David Decotigny <decot@...gle.com>, Anjali Singhai <anjali.singhai@...el.com>,
Sridhar Samudrala <sridhar.samudrala@...el.com>, Brian Vazquez <brianvv@...gle.com>,
emil.s.tantilov@...el.com
Subject: Re: [Intel-wired-lan] [PATCH 1/2] idpf: skip deallocating bufq_sets
from rx_qgrp if it is NULL.
On Mon, Jan 12, 2026 at 10:31 PM Paul Menzel <pmenzel@...gen.mpg.de> wrote:
>
> Dear Li,
>
>
> Thank you for your patch.
>
> Am 13.01.26 um 00:09 schrieb Li Li via Intel-wired-lan:
> > In idpf_rxq_group_alloc(), if rx_qgrp->splitq.bufq_sets failed to get
> > allocated:
> >
> > rx_qgrp->splitq.bufq_sets = kcalloc(vport->num_bufqs_per_qgrp,
> > sizeof(struct idpf_bufq_set),
> > GFP_KERNEL);
> > if (!rx_qgrp->splitq.bufq_sets) {
> > err = -ENOMEM;
> > goto err_alloc;
> > }
> >
> > idpf_rxq_group_rel() would attempt to deallocate it in
> > idpf_rxq_sw_queue_rel(), causing a kernel panic:
> >
> > ```
> > [    8.127804] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> > ...
> > [ 8.129779] RIP: 0010:idpf_rxq_group_rel+0x101/0x170
> > ...
> > [ 8.133854] Call Trace:
> > [ 8.133980] <TASK>
> > [ 8.134092] idpf_vport_queues_alloc+0x286/0x500
> > [ 8.134313] idpf_vport_open+0x4d/0x3f0
> > [ 8.134498] idpf_open+0x71/0xb0
> > [ 8.134668] __dev_open+0x142/0x260
> > [ 8.134840] netif_open+0x2f/0xe0
> > [ 8.135004] dev_open+0x3d/0x70
> > [ 8.135166] bond_enslave+0x5ed/0xf50
> > [ 8.135345] ? nla_put_ifalias+0x3d/0x90
> > [ 8.135533] ? kvfree_call_rcu+0xb5/0x3b0
> > [ 8.135725] ? kvfree_call_rcu+0xb5/0x3b0
> > [ 8.135916] do_set_master+0x114/0x160
> > [ 8.136098] do_setlink+0x412/0xfb0
> > [ 8.136269] ? security_sock_rcv_skb+0x2a/0x50
> > [ 8.136509] ? sk_filter_trim_cap+0x7c/0x320
> > [ 8.136714] ? skb_queue_tail+0x20/0x50
> > [ 8.136899] ? __nla_validate_parse+0x92/0xe50
> > [ 8.137112] ? security_capable+0x35/0x60
> > [ 8.137304] rtnl_newlink+0x95c/0xa00
> > [ 8.137483] ? __rtnl_unlock+0x37/0x70
> > [ 8.137664] ? netdev_run_todo+0x63/0x530
> > [ 8.137855] ? allocate_slab+0x280/0x870
> > [ 8.138044] ? security_capable+0x35/0x60
> > [ 8.138235] rtnetlink_rcv_msg+0x2e6/0x340
> > [ 8.138431] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
> > [ 8.138650] netlink_rcv_skb+0x16a/0x1a0
> > [ 8.138840] netlink_unicast+0x20a/0x320
> > [ 8.139028] netlink_sendmsg+0x304/0x3b0
> > [ 8.139217] __sock_sendmsg+0x89/0xb0
> > [ 8.139399] ____sys_sendmsg+0x167/0x1c0
> > [ 8.139588] ? ____sys_recvmsg+0xed/0x150
> > [ 8.139780] ___sys_sendmsg+0xdd/0x120
> > [ 8.139960] ? ___sys_recvmsg+0x124/0x1e0
> > [ 8.140152] ? rcutree_enqueue+0x1f/0xb0
> > [ 8.140341] ? rcutree_enqueue+0x1f/0xb0
> > [ 8.140528] ? call_rcu+0xde/0x2a0
> > [ 8.140695] ? evict+0x286/0x2d0
> > [ 8.140856] ? rcutree_enqueue+0x1f/0xb0
> > [ 8.141043] ? kmem_cache_free+0x2c/0x350
> > [ 8.141236] __x64_sys_sendmsg+0x72/0xc0
> > [ 8.141424] do_syscall_64+0x6f/0x890
> > [ 8.141603] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > [ 8.141841] RIP: 0033:0x7f2799d21bd0
> > ...
> > [ 8.149905] Kernel panic - not syncing: Fatal exception
> > [ 8.175940] Kernel Offset: 0xf800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [ 8.176425] Rebooting in 10 seconds..
> > ```
> >
> > Tested: With this patch, the kernel panic no longer appears.
>
> Is it easy to reproduce?
In our internal environments, the idpf driver runs on machines with
little RAM, so it is not uncommon for them to run out of memory and
hit allocation failures, especially in kcalloc() calls that request
higher-order memory.
To reliably reproduce the issue in my own testing, I simply forced
rx_qgrp->splitq.bufq_sets to NULL right after the allocation:
rx_qgrp->splitq.bufq_sets = kcalloc(vport->num_bufqs_per_qgrp,
                                    sizeof(struct idpf_bufq_set),
                                    GFP_KERNEL);
rx_qgrp->splitq.bufq_sets = NULL;
If the error path works correctly, no kernel panic should occur.
>
> > Fixes: 95af467d9a4e ("idpf: configure resources for RX queues")
> >
>
> (Just for the future, a blank in the “tag section” is uncommon.)
Thank you for the info!
>
> > Signed-off-by: Li Li <boolli@...gle.com>
> > ---
> > drivers/net/ethernet/intel/idpf/idpf_txrx.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> > index e7b131dba200c..b4dab4a8ee11b 100644
> > --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> > +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> > @@ -1337,6 +1337,8 @@ static void idpf_txq_group_rel(struct idpf_vport *vport)
> > static void idpf_rxq_sw_queue_rel(struct idpf_rxq_group *rx_qgrp)
> > {
> > int i, j;
> > + if (!rx_qgrp->splitq.bufq_sets)
> > + return;
> >
> > for (i = 0; i < rx_qgrp->vport->num_bufqs_per_qgrp; i++) {
> > struct idpf_bufq_set *bufq_set = &rx_qgrp->splitq.bufq_sets[i];
>
> Reviewed-by: Paul Menzel <pmenzel@...gen.mpg.de>
>
>
> Kind regards,
>
> Paul