Message-ID: <IA3PR11MB898636DB7C48DDB28FF88A09E582A@IA3PR11MB8986.namprd11.prod.outlook.com>
Date: Fri, 9 Jan 2026 06:06:36 +0000
From: "Loktionov, Aleksandr" <aleksandr.loktionov@...el.com>
To: Li Li <boolli@...gle.com>, "Nguyen, Anthony L"
<anthony.l.nguyen@...el.com>, "Kitszel, Przemyslaw"
<przemyslaw.kitszel@...el.com>, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
"intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "David
Decotigny" <decot@...gle.com>, "Singhai, Anjali" <anjali.singhai@...el.com>,
"Samudrala, Sridhar" <sridhar.samudrala@...el.com>, Brian Vazquez
<brianvv@...gle.com>, "Tantilov, Emil S" <emil.s.tantilov@...el.com>
Subject: RE: [Intel-wired-lan] [PATCH 5/5] idpf: skip stopping/opening vport
if it is NULL during HW reset
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@...osl.org> On Behalf
> Of Li Li via Intel-wired-lan
> Sent: Wednesday, January 7, 2026 2:05 AM
> To: Nguyen, Anthony L <anthony.l.nguyen@...el.com>; Kitszel,
> Przemyslaw <przemyslaw.kitszel@...el.com>; David S. Miller
> <davem@...emloft.net>; Jakub Kicinski <kuba@...nel.org>; Eric
> Dumazet <edumazet@...gle.com>; intel-wired-lan@...ts.osuosl.org
> Cc: netdev@...r.kernel.org; linux-kernel@...r.kernel.org; David
> Decotigny <decot@...gle.com>; Singhai, Anjali
> <anjali.singhai@...el.com>; Samudrala, Sridhar
> <sridhar.samudrala@...el.com>; Brian Vazquez <brianvv@...gle.com>;
> Li Li <boolli@...gle.com>; Tantilov, Emil S
> <emil.s.tantilov@...el.com>
> Subject: [Intel-wired-lan] [PATCH 5/5] idpf: skip stopping/opening
> vport if it is NULL during HW reset
>
> When an idpf HW reset is triggered, it clears the vport but does not
> clear the netdev held by vport:
>
> // In idpf_vport_dealloc() called by idpf_init_hard_reset(),
> // idpf_init_hard_reset() sets IDPF_HR_RESET_IN_PROG, so
> // idpf_decfg_netdev() doesn't get called.
> if (!test_bit(IDPF_HR_RESET_IN_PROG, adapter->flags))
> idpf_decfg_netdev(vport);
> // idpf_decfg_netdev() would clear netdev but it isn't called:
> unregister_netdev(vport->netdev);
> free_netdev(vport->netdev);
> vport->netdev = NULL;
> // Later in idpf_init_hard_reset(), the vport is cleared:
> kfree(adapter->vports);
> adapter->vports = NULL;
>
> During an idpf HW reset, when userspace restarts the network
> service, the vport associated with the netdev is NULL, and so a
> kernel panic would happen:
>
> [ 1791.669339] BUG: kernel NULL pointer dereference, address: 0000000000000070
> ...
> [ 1791.717130] RIP: 0010:idpf_vport_stop+0x16/0x1c0
>
> This can be reproduced reliably by injecting a TX timeout to cause
> an idpf HW reset, and injecting a virtchnl error to cause the HW
> reset to fail and retry, while running "service network restart" in
> userspace.
>
> With this patch applied, we see the following error but no kernel
> panics anymore:
>
> [ 181.409483] idpf 0000:05:00.0 eth1: mtu not changed due to no vport in netdev
> RTNETLINK answers: Bad address
> ...
> [ 181.913644] idpf 0000:05:00.0 eth1: not stopping vport because it is NULL
> [ 181.938675] idpf 0000:05:00.0 eth1: mtu not changed due to no vport in netdev
> ...
> [ 242.849499] idpf 0000:05:00.0 eth1: not opening vport because it is NULL
> ...
> [ 304.289364] idpf 0000:05:00.0 eth0: not opening vport because it is NULL
>
> Signed-off-by: Li Li <boolli@...gle.com>
> ---
> drivers/net/ethernet/intel/idpf/idpf_lib.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
> index 53b31989722a7..a9a556499262b 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
> +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
> @@ -1021,6 +1021,8 @@ static void idpf_vport_stop(struct idpf_vport *vport, bool rtnl)
> */
> static int idpf_stop(struct net_device *netdev)
> {
> + if (!netdev)
> + return 0;
> struct idpf_netdev_priv *np = netdev_priv(netdev);
> struct idpf_vport *vport;
>
> @@ -1029,9 +1031,14 @@ static int idpf_stop(struct net_device *netdev)
>
> idpf_vport_ctrl_lock(netdev);
> vport = idpf_netdev_to_vport(netdev);
> + if (!vport) {
> + netdev_err(netdev, "not stopping vport because it is NULL");
Please don't forget to add a trailing '\n' to the message string.
> + goto unlock;
> + }
>
> idpf_vport_stop(vport, false);
>
> +unlock:
> idpf_vport_ctrl_unlock(netdev);
>
> return 0;
> @@ -2301,6 +2308,11 @@ static int idpf_open(struct net_device *netdev)
>
> idpf_vport_ctrl_lock(netdev);
> vport = idpf_netdev_to_vport(netdev);
> + if (!vport) {
> + netdev_err(netdev, "not opening vport because it is NULL");
Please don't forget to add a trailing '\n' here too.
> + err = -EFAULT;
> + goto unlock;
> + }
>
> err = idpf_set_real_num_queues(vport);
> if (err)
> --
> 2.52.0.351.gbe84eed79e-goog
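For reference, the guard pattern in the two hunks above can be modeled in plain userspace C. This is only a sketch under stated assumptions: `model_netdev`, `model_vport`, `model_lock()`, `model_unlock()`, `model_stop()` and `model_open()` are hypothetical stand-ins and not the real idpf types or helpers, and `fprintf()` stands in for `netdev_err()`. It shows the intended control flow (take the lock, look up the vport, bail out through a single unlock label when it is NULL) with each message string ending in '\n'.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver structures, for illustration only. */
struct model_vport {
	int up;
};

struct model_netdev {
	const char *name;
	struct model_vport *vport;	/* NULL while a reset has freed the vports */
	int locked;			/* lock depth, to check lock/unlock balance */
};

static void model_lock(struct model_netdev *nd)
{
	nd->locked++;
}

static void model_unlock(struct model_netdev *nd)
{
	nd->locked--;
}

/* Stop path: a missing vport is reported, but stop still returns 0. */
static int model_stop(struct model_netdev *nd)
{
	struct model_vport *vport;

	model_lock(nd);
	vport = nd->vport;
	if (!vport) {
		fprintf(stderr, "%s: not stopping vport because it is NULL\n",
			nd->name);
		goto unlock;
	}

	vport->up = 0;

unlock:
	model_unlock(nd);

	return 0;
}

/* Open path: a missing vport is an error, reported as -EFAULT. */
static int model_open(struct model_netdev *nd)
{
	struct model_vport *vport;
	int err = 0;

	model_lock(nd);
	vport = nd->vport;
	if (!vport) {
		fprintf(stderr, "%s: not opening vport because it is NULL\n",
			nd->name);
		err = -EFAULT;
		goto unlock;
	}

	vport->up = 1;

unlock:
	model_unlock(nd);

	return err;
}
```

Both paths release the lock on every exit, which is the property that matters when the lookup fails in the middle of a reset.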
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@...el.com>