Message-ID: <9fa390cf-526e-4ac8-82e8-99f616b807b6@intel.com>
Date: Mon, 10 Mar 2025 21:52:30 -0700
From: "Tantilov, Emil S" <emil.s.tantilov@...el.com>
To: Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>
CC: <intel-wired-lan@...ts.osuosl.org>, <netdev@...r.kernel.org>,
<decot@...gle.com>, <willemb@...gle.com>, <anthony.l.nguyen@...el.com>,
<davem@...emloft.net>, <edumazet@...gle.com>, <kuba@...nel.org>,
<pabeni@...hat.com>, <madhu.chittim@...el.com>,
<Aleksandr.Loktionov@...el.com>, <yuma@...hat.com>, <mschmidt@...hat.com>,
Simon Horman <horms@...nel.org>
Subject: Re: [PATCH iwl-net] idpf: fix adapter NULL pointer dereference on
reboot
On 3/6/2025 9:58 PM, Michal Swiatkowski wrote:
> On Thu, Mar 06, 2025 at 04:39:56PM -0800, Emil Tantilov wrote:
>> Driver calls idpf_remove() from idpf_shutdown(), which can end up
>> calling idpf_remove() again when disabling SRIOV.
>>
>
> The same is done in other drivers (ice, iavf). Why is it a problem here?
> I am asking because having one function to remove is pretty handy.
> Maybe the problem can be fixed by some changes in idpf_remove() instead?
It was indeed handy, until we ran into the crash. I did look into fixing
it in idpf_remove(), but I don't think I have a lot of options. I can
simply check and exit on adapter being NULL, but these types of checks
are usually frowned upon, so I looked into alternatives.
The main difference between idpf and ice is that idpf will load on both
VF and PF devices. From what I can tell, the VFs created by ice are
supported by iavf (0x1889 device id). With VFs created on idpf, we end
up calling into idpf_remove() twice: first on shutdown, and then again
when idpf_remove() calls into sriov_disable(), because the VF devices
have the same driver, hence the same remove routine.
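To make the sequence easier to follow, here is a rough userspace model of
it. This is not driver code: every model_* name and struct field is made up
for illustration, and it assumes the VF gets shut down before the PF and
that remove clears the driver data when it finishes. Where the real
idpf_remove() would dereference the NULL adapter, the model only records
that it happened.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct idpf_adapter and struct pci_dev. */
struct model_adapter {
	int num_vfs;
};

struct model_dev {
	struct model_adapter *drvdata;	/* what pci_get_drvdata() would return */
	struct model_dev *vf;		/* the single VF of a PF, NULL otherwise */
	int remove_calls;		/* times remove ran on this device */
	int null_adapter_seen;		/* remove ran with drvdata == NULL */
};

void model_remove(struct model_dev *dev);

/* Models sriov_disable(): unbinding the VF ends up in the same remove
 * routine, because the VF and the PF are bound to the same driver. */
void model_sriov_disable(struct model_dev *pf)
{
	if (pf->vf)
		model_remove(pf->vf);
}

void model_remove(struct model_dev *dev)
{
	struct model_adapter *adapter = dev->drvdata;

	dev->remove_calls++;
	if (!adapter) {
		/* The real driver would crash here; the model records it. */
		dev->null_adapter_seen = 1;
		return;
	}

	if (adapter->num_vfs > 0) {
		adapter->num_vfs = 0;
		model_sriov_disable(dev);
	}

	dev->drvdata = NULL;	/* teardown done, driver data is gone */
}

/* Shutdown as it was before the patch: it simply reuses remove. */
void model_shutdown_old(struct model_dev *dev)
{
	model_remove(dev);
}

/* Reboot with one PF and one VF; returns how often remove ran on the VF. */
int model_reboot(struct model_dev *pf, struct model_dev *vf)
{
	assert(pf->vf == vf);

	model_shutdown_old(vf);	/* assumed order: VF first */
	model_shutdown_old(pf);	/* PF remove disables SR-IOV, VF remove again */
	return vf->remove_calls;
}
```

With a PF that has num_vfs set to 1 and a VF whose drvdata is still valid
at the start, model_reboot() runs remove on the VF twice, and the second
run is the one that finds no adapter. With ice the VF would be bound to a
different driver (iavf), so the second call would never reach the PF
driver's remove routine.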
>
>> echo 1 > /sys/class/net/<netif>/device/sriov_numvfs
>> reboot
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000020
>> ...
>> RIP: 0010:idpf_remove+0x22/0x1f0 [idpf]
>> ...
>> ? idpf_remove+0x22/0x1f0 [idpf]
>> ? idpf_remove+0x1e4/0x1f0 [idpf]
>> pci_device_remove+0x3f/0xb0
>> device_release_driver_internal+0x19f/0x200
>> pci_stop_bus_device+0x6d/0x90
>> pci_stop_and_remove_bus_device+0x12/0x20
>> pci_iov_remove_virtfn+0xbe/0x120
>> sriov_disable+0x34/0xe0
>> idpf_sriov_configure+0x58/0x140 [idpf]
>> idpf_remove+0x1b9/0x1f0 [idpf]
>> idpf_shutdown+0x12/0x30 [idpf]
>> pci_device_shutdown+0x35/0x60
>> device_shutdown+0x156/0x200
>> ...
>>
>> Replace the direct idpf_remove() call in idpf_shutdown() with
>> idpf_vc_core_deinit() and idpf_deinit_dflt_mbx(), which perform
>> the bulk of the cleanup, such as stopping the init task, freeing IRQs,
>> destroying the vports and freeing the mailbox.
>>
>> Reported-by: Yuying Ma <yuma@...hat.com>
>> Fixes: e850efed5e15 ("idpf: add module register and probe functionality")
>> Reviewed-by: Madhu Chittim <madhu.chittim@...el.com>
>> Signed-off-by: Emil Tantilov <emil.s.tantilov@...el.com>
>> ---
>> drivers/net/ethernet/intel/idpf/idpf_main.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/intel/idpf/idpf_main.c b/drivers/net/ethernet/intel/idpf/idpf_main.c
>> index b6c515d14cbf..bec4a02c5373 100644
>> --- a/drivers/net/ethernet/intel/idpf/idpf_main.c
>> +++ b/drivers/net/ethernet/intel/idpf/idpf_main.c
>> @@ -87,7 +87,11 @@ static void idpf_remove(struct pci_dev *pdev)
>>   */
>>  static void idpf_shutdown(struct pci_dev *pdev)
>>  {
>> -	idpf_remove(pdev);
>> +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
>> +
>> +	cancel_delayed_work_sync(&adapter->vc_event_task);
>> +	idpf_vc_core_deinit(adapter);
>> +	idpf_deinit_dflt_mbx(adapter);
>>
>>  	if (system_state == SYSTEM_POWER_OFF)
>>  		pci_set_power_state(pdev, PCI_D3hot);
>> --
>> 2.17.2