[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250320174808.GG892515@horms.kernel.org>
Date: Thu, 20 Mar 2025 17:48:08 +0000
From: Simon Horman <horms@...nel.org>
To: Emil Tantilov <emil.s.tantilov@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org,
decot@...gle.com, willemb@...gle.com, anthony.l.nguyen@...el.com,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, madhu.chittim@...el.com,
Aleksandr.Loktionov@...el.com, yuma@...hat.com, mschmidt@...hat.com,
michal.swiatkowski@...ux.intel.com
Subject: Re: [PATCH iwl-net v2] idpf: fix adapter NULL pointer dereference on
reboot
On Mon, Mar 17, 2025 at 10:42:02PM -0700, Emil Tantilov wrote:
> With SRIOV enabled, idpf ends up calling into idpf_remove() twice.
> First via idpf_shutdown() and then again when idpf_remove() calls into
> sriov_disable(), because the VF devices use the idpf driver, hence the
> same remove routine. When that happens, it is possible for the adapter
> to be NULL from the first call to idpf_remove(), leading to a NULL
> pointer dereference.
>
> echo 1 > /sys/class/net/<netif>/device/sriov_numvfs
> reboot
>
> BUG: kernel NULL pointer dereference, address: 0000000000000020
> ...
> RIP: 0010:idpf_remove+0x22/0x1f0 [idpf]
> ...
> ? idpf_remove+0x22/0x1f0 [idpf]
> ? idpf_remove+0x1e4/0x1f0 [idpf]
> pci_device_remove+0x3f/0xb0
> device_release_driver_internal+0x19f/0x200
> pci_stop_bus_device+0x6d/0x90
> pci_stop_and_remove_bus_device+0x12/0x20
> pci_iov_remove_virtfn+0xbe/0x120
> sriov_disable+0x34/0xe0
> idpf_sriov_configure+0x58/0x140 [idpf]
> idpf_remove+0x1b9/0x1f0 [idpf]
> idpf_shutdown+0x12/0x30 [idpf]
> pci_device_shutdown+0x35/0x60
> device_shutdown+0x156/0x200
> ...
>
> Replace the direct idpf_remove() call in idpf_shutdown() with
> idpf_vc_core_deinit() and idpf_deinit_dflt_mbx(), which perform
> the bulk of the cleanup, such as stopping the init task, freeing IRQs,
> destroying the vports and freeing the mailbox. This avoids the calls to
> sriov_disable() in addition to a small netdev cleanup, and destroying
> workqueues, which don't seem to be required on shutdown.
>
> Reported-by: Yuying Ma <yuma@...hat.com>
> Fixes: e850efed5e15 ("idpf: add module register and probe functionality")
> Reviewed-by: Madhu Chittim <madhu.chittim@...el.com>
> Signed-off-by: Emil Tantilov <emil.s.tantilov@...el.com>
> ---
> Changelog:
> v2:
> - Updated the description to clarify the path leading up to the crash,
> and the difference in the logic between remove and shutdown as result
> of this change.
>
> v1:
> https://lore.kernel.org/intel-wired-lan/20250307003956.22018-1-emil.s.tantilov@intel.com/
Thanks for the update.
Reviewed-by: Simon Horman <horms@...nel.org>
Powered by blists - more mailing lists