Message-ID: <20260107010503.2242163-5-boolli@google.com>
Date: Wed, 7 Jan 2026 01:05:03 +0000
From: Li Li <boolli@...gle.com>
To: Tony Nguyen <anthony.l.nguyen@...el.com>,
Przemek Kitszel <przemyslaw.kitszel@...el.com>, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>, intel-wired-lan@...ts.osuosl.org
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
David Decotigny <decot@...gle.com>, Anjali Singhai <anjali.singhai@...el.com>,
Sridhar Samudrala <sridhar.samudrala@...el.com>, Brian Vazquez <brianvv@...gle.com>,
Li Li <boolli@...gle.com>, emil.s.tantilov@...el.com
Subject: [PATCH 5/5] idpf: skip stopping/opening vport if it is NULL during HW reset

When an idpf HW reset is triggered, it clears the vport but does
not clear the netdev held by vport:

    // In idpf_vport_dealloc() called by idpf_init_hard_reset(),
    // idpf_init_hard_reset() sets IDPF_HR_RESET_IN_PROG, so
    // idpf_decfg_netdev() doesn't get called.
    if (!test_bit(IDPF_HR_RESET_IN_PROG, adapter->flags))
        idpf_decfg_netdev(vport);

    // idpf_decfg_netdev() would clear netdev but it isn't called:
    unregister_netdev(vport->netdev);
    free_netdev(vport->netdev);
    vport->netdev = NULL;

    // Later in idpf_init_hard_reset(), the vport is cleared:
    kfree(adapter->vports);
    adapter->vports = NULL;

If userspace restarts the network service during an idpf HW reset,
the vport associated with the netdev is NULL, and dereferencing it
causes a kernel panic:
[ 1791.669339] BUG: kernel NULL pointer dereference, address: 0000000000000070
...
[ 1791.717130] RIP: 0010:idpf_vport_stop+0x16/0x1c0
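
The small, nonzero fault address is consistent with a struct member
being read through a NULL vport pointer: the CPU faults at NULL plus
the member's offset. A minimal sketch of that arithmetic follows. The
struct layout and every name in it are hypothetical; the real
struct idpf_vport layout is not shown in this message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real struct idpf_vport:
 * 0x70 bytes of leading members, then the member being read. */
struct mock_vport_layout {
	unsigned char leading_members[0x70];
	int member_read_in_stop;
};

/* Address that would be touched when reading vport->member_read_in_stop.
 * With vport == NULL this is just the member's offset. */
static uintptr_t mock_fault_address(const struct mock_vport_layout *vport)
{
	return (uintptr_t)vport +
	       offsetof(struct mock_vport_layout, member_read_in_stop);
}
```

Calling mock_fault_address(NULL) yields 0x70 on platforms where NULL
is represented as address zero, matching the shape of the oops above.
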
This can be reproduced reliably by injecting a TX timeout to cause
an idpf HW reset, and injecting a virtchnl error to cause the HW
reset to fail and retry, while running "service network restart" in
userspace.
With this patch applied, the following errors are logged but the
kernel no longer panics:
[ 181.409483] idpf 0000:05:00.0 eth1: mtu not changed due to no vport in netdev
RTNETLINK answers: Bad address
...
[ 181.913644] idpf 0000:05:00.0 eth1: not stopping vport because it is NULL
[ 181.938675] idpf 0000:05:00.0 eth1: mtu not changed due to no vport in netdev
...
[ 242.849499] idpf 0000:05:00.0 eth1: not opening vport because it is NULL
...
[ 304.289364] idpf 0000:05:00.0 eth0: not opening vport because it is NULL
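
The guard pattern behind these messages can be sketched in standalone
userspace C. This is only an illustration under stated assumptions:
the mock_* types and functions are hypothetical stand-ins, not the
driver's real structures, and the lookup is reduced to a single
pointer that a reset is assumed to have set to NULL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; not the real idpf or net_device types. */
struct mock_vport {
	int up;
};

struct mock_netdev {
	const char *name;
	struct mock_vport *vport; /* assumed NULL while a reset is in progress */
};

/* Plays the role of idpf_netdev_to_vport(): may return NULL mid-reset. */
static struct mock_vport *mock_netdev_to_vport(struct mock_netdev *netdev)
{
	return netdev->vport;
}

/* Stop path: with no vport there is nothing to stop, so return 0. */
static int mock_stop(struct mock_netdev *netdev)
{
	struct mock_vport *vport = mock_netdev_to_vport(netdev);

	if (!vport) {
		fprintf(stderr, "%s: not stopping vport because it is NULL\n",
			netdev->name);
		return 0;
	}
	vport->up = 0;
	return 0;
}

/* Open path: the device cannot come up without a vport, so fail. */
static int mock_open(struct mock_netdev *netdev)
{
	struct mock_vport *vport = mock_netdev_to_vport(netdev);

	if (!vport) {
		fprintf(stderr, "%s: not opening vport because it is NULL\n",
			netdev->name);
		return -EFAULT;
	}
	vport->up = 1;
	return 0;
}
```

The asymmetry mirrors the patch: the stop path returns 0 because
nothing is left to tear down, while the open path returns -EFAULT,
which userspace sees as "RTNETLINK answers: Bad address".
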
Signed-off-by: Li Li <boolli@...gle.com>
---
drivers/net/ethernet/intel/idpf/idpf_lib.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index 53b31989722a7..a9a556499262b 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -1021,6 +1021,8 @@ static void idpf_vport_stop(struct idpf_vport *vport, bool rtnl)
*/
static int idpf_stop(struct net_device *netdev)
{
+ if (!netdev)
+ return 0;
struct idpf_netdev_priv *np = netdev_priv(netdev);
struct idpf_vport *vport;
@@ -1029,9 +1031,14 @@ static int idpf_stop(struct net_device *netdev)
idpf_vport_ctrl_lock(netdev);
vport = idpf_netdev_to_vport(netdev);
+ if (!vport) {
+ netdev_err(netdev, "not stopping vport because it is NULL");
+ goto unlock;
+ }
idpf_vport_stop(vport, false);
+unlock:
idpf_vport_ctrl_unlock(netdev);
return 0;
@@ -2301,6 +2308,11 @@ static int idpf_open(struct net_device *netdev)
idpf_vport_ctrl_lock(netdev);
vport = idpf_netdev_to_vport(netdev);
+ if (!vport) {
+ netdev_err(netdev, "not opening vport because it is NULL");
+ err = -EFAULT;
+ goto unlock;
+ }
err = idpf_set_real_num_queues(vport);
if (err)
--
2.52.0.351.gbe84eed79e-goog