Message-ID: <CAODvEq5L9dBHAfmhATtXmuUde7My1wCobMN1JRvACDKPwa3XRQ@mail.gmail.com>
Date: Wed, 7 Jan 2026 10:40:10 -0800
From: Li Li <boolli@...gle.com>
To: "Tantilov, Emil S" <emil.s.tantilov@...el.com>
Cc: Tony Nguyen <anthony.l.nguyen@...el.com>, 
	Przemek Kitszel <przemyslaw.kitszel@...el.com>, "David S. Miller" <davem@...emloft.net>, 
	Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>, intel-wired-lan@...ts.osuosl.org, 
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	David Decotigny <decot@...gle.com>, Anjali Singhai <anjali.singhai@...el.com>, 
	Sridhar Samudrala <sridhar.samudrala@...el.com>, Brian Vazquez <brianvv@...gle.com>
Subject: Re: [Intel-wired-lan] [PATCH 1/5] idpf: skip getting/setting ring
 params if vport is NULL during HW reset

Please reject this patch series, given that the underlying issue is
already fixed in an earlier patch series. Thanks.

On Wed, Jan 7, 2026 at 9:41 AM Tantilov, Emil S
<emil.s.tantilov@...el.com> wrote:
>
>
>
> On 1/6/2026 5:04 PM, Li Li via Intel-wired-lan wrote:
> > When an idpf HW reset is triggered, it clears the vport but does
> > not clear the netdev held by vport:
> >
> >      // In idpf_vport_dealloc() called by idpf_init_hard_reset(),
> >      // idpf_init_hard_reset() sets IDPF_HR_RESET_IN_PROG, so
> >      // idpf_decfg_netdev() doesn't get called.
> >      if (!test_bit(IDPF_HR_RESET_IN_PROG, adapter->flags))
> >          idpf_decfg_netdev(vport);
> >      // idpf_decfg_netdev() would clear netdev but it isn't called:
> >      unregister_netdev(vport->netdev);
> >      free_netdev(vport->netdev);
> >      vport->netdev = NULL;
> >      // Later in idpf_init_hard_reset(), the vport is cleared:
> >      kfree(adapter->vports);
> >      adapter->vports = NULL;
> >
> > During an idpf HW reset, when "ethtool -g/-G" is called on the netdev,
> > the vport associated with the netdev is NULL, and so a kernel panic
> > would happen:
> >
> > [  513.185327] BUG: kernel NULL pointer dereference, address: 0000000000000038
> > ...
> > [  513.232756] RIP: 0010:idpf_get_ringparam+0x45/0x80
> >
> > This can be reproduced reliably by injecting a TX timeout to cause
> > an idpf HW reset, and injecting a virtchnl error to cause the HW
> > reset to fail and retry, while calling "ethtool -g/-G" on the netdev
> > at the same time.
>
> I have posted a series that resolves these issues in the reset path by
> reshuffling the flow a bit and adding netif_device_detach/attach to
> make sure the netdevs are better protected in the middle of a reset:
> https://lore.kernel.org/intel-wired-lan/20251121001218.4565-1-emil.s.tantilov@intel.com/
>
> If you are still seeing issues with the above applied, let me know and I
> can take a look.

Thanks Emil! Yes, I ran the experiment at a commit past your patch
series above, and the kernel panic no longer appears. Performing
ethtool commands during HW resets now results in "netlink error: No
such device", which is expected because the netdev is detached at the
start of the HW reset.

Please reject this patch series, thanks!

>
> >
> > With this patch applied, we see the following error but no kernel
> > panics anymore:
> >
> > [  476.323630] idpf 0000:05:00.0 eth1: failed to get ring params due to no vport in netdev
> >
> > Signed-off-by: Li Li <boolli@...gle.com>
> > ---
> >   drivers/net/ethernet/intel/idpf/idpf_ethtool.c | 12 ++++++++++++
> >   1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
> > index d5711be0b8e69..6a4b630b786c2 100644
> > --- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
> > +++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
> > @@ -639,6 +638,10 @@ static void idpf_get_ringparam(struct net_device *netdev,
> >
> >       idpf_vport_ctrl_lock(netdev);
> >       vport = idpf_netdev_to_vport(netdev);
> > +     if (!vport) {
>
> We used to have these all over the place, but the code was changed to
> rely on idpf_vport_ctrl_lock() for the protection of the vport state.
> Still some issues remain with the error paths (hence the series above),
> but in general we don't want to resort to vport NULL checks and rather
> fix the reset flows to rely on cleaner logic and locks.
>
> Thanks,
> Emil
>
> > +             netdev_err(netdev, "failed to get ring params due to no vport in netdev\n");
> > +             goto unlock;
> > +     }
> >
> >       ring->rx_max_pending = IDPF_MAX_RXQ_DESC;
> >       ring->tx_max_pending = IDPF_MAX_TXQ_DESC;
> > @@ -647,6 +651,7 @@ static void idpf_get_ringparam(struct net_device *netdev,
> >
> >       kring->tcp_data_split = idpf_vport_get_hsplit(vport);
> >
> > +unlock:
> >       idpf_vport_ctrl_unlock(netdev);
> >   }
> >
> > @@ -673,6 +674,11 @@ static int idpf_set_ringparam(struct net_device *netdev,
> >
> >       idpf_vport_ctrl_lock(netdev);
> >       vport = idpf_netdev_to_vport(netdev);
> > +     if (!vport) {
> > +             netdev_err(netdev, "ring params not changed due to no vport in netdev\n");
> > +             err = -EFAULT;
> > +             goto unlock_mutex;
> > +     }
> >
> >       idx = vport->idx;
> >
>
