Message-ID: <72975a9c-0daf-4100-b31a-cee0f52e2514@linux.intel.com>
Date: Thu, 13 Feb 2025 13:32:38 +0100
From: Marcin Szycik <marcin.szycik@...ux.intel.com>
To: Simon Horman <horms@...nel.org>
Cc: intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org,
michal.swiatkowski@...ux.intel.com,
Sujai Buvaneswaran <sujai.buvaneswaran@...el.com>,
Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@...ux.intel.com>
Subject: Re: [PATCH iwl-net 1/2] ice: Fix deinitializing VF in error path

On 13.02.2025 11:55, Simon Horman wrote:
> On Tue, Feb 11, 2025 at 06:43:21PM +0100, Marcin Szycik wrote:
>> If ice_ena_vfs() fails after calling ice_create_vf_entries(), it frees
>> all VFs without removing them from snapshot PF-VF mailbox list, leading
>> to list corruption.
>>
>> Reproducer:
>> devlink dev eswitch set $PF1_PCI mode switchdev
>> ip l s $PF1 up
>> ip l s $PF1 promisc on
>> sleep 1
>> echo 1 > /sys/class/net/$PF1/device/sriov_numvfs
>
> Should the line above be "echo 0" to remove the VFs before creating VFs
> below (I'm looking at sriov_numvfs_store())?

Both "echo 1" commands fail (I'm fixing that in patch 2/2), which is why
there's no "echo 0" in between. Also, this minimal example assumes that no
VFs were present initially.

Thanks for reviewing!
Marcin

>> sleep 1
>> echo 1 > /sys/class/net/$PF1/device/sriov_numvfs
>>
>> Trace (minimized):
>> list_add corruption. next->prev should be prev (ffff8882e241c6f0), but was 0000000000000000. (next=ffff888455da1330).
>> kernel BUG at lib/list_debug.c:29!
>> RIP: 0010:__list_add_valid_or_report+0xa6/0x100
>> ice_mbx_init_vf_info+0xa7/0x180 [ice]
>> ice_initialize_vf_entry+0x1fa/0x250 [ice]
>> ice_sriov_configure+0x8d7/0x1520 [ice]
>> ? __percpu_ref_switch_mode+0x1b1/0x5d0
>> ? __pfx_ice_sriov_configure+0x10/0x10 [ice]
>>
>> Sometimes a KASAN report can be seen instead with a similar stack trace:
>> BUG: KASAN: use-after-free in __list_add_valid_or_report+0xf1/0x100
>>
>> VFs are added to this list in ice_mbx_init_vf_info(), but only removed
>> in ice_free_vfs(). Move the removing to ice_free_vf_entries(), which is
>> also being called in other places where VFs are being removed (including
>> ice_free_vfs() itself).
>>
>> Fixes: 8cd8a6b17d27 ("ice: move VF overflow message count into struct ice_mbx_vf_info")
>> Reported-by: Sujai Buvaneswaran <sujai.buvaneswaran@...el.com>
>> Closes: https://lore.kernel.org/intel-wired-lan/PH0PR11MB50138B635F2E5CEB7075325D961F2@PH0PR11MB5013.namprd11.prod.outlook.com
>> Reviewed-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@...ux.intel.com>
>> Signed-off-by: Marcin Szycik <marcin.szycik@...ux.intel.com>
>
> The comment above notwithstanding, I agree that this addresses the
> bug you have described.
>
> Reviewed-by: Simon Horman <horms@...nel.org>
>
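
For readers following the thread, the failure mode described in the commit
message can be sketched in userspace C. This is only an illustration under
stated assumptions, not the driver code: struct vf_entry, vf_create(),
vf_destroy() and simulate_failed_enable_then_retry() are hypothetical names,
and the small list helpers merely imitate the kernel's list_head. The point
it shows is the one the patch makes: when unlinking from the list lives in
the same helper that frees the entry, every teardown path (including the
error path) leaves the list consistent, so a later retry can add to it
safely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, imitating the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_front(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

static void list_del_node(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = NULL;
	node->prev = NULL;
}

static size_t list_len(const struct list_node *head)
{
	size_t n = 0;

	for (const struct list_node *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* Hypothetical stand-in for a VF entry tracked on a mailbox list. */
struct vf_entry {
	int id;
	struct list_node mbx_node;
};

/* Allocate an entry and link it into the list (the "init" side). */
static struct vf_entry *vf_create(struct list_node *mbx_list, int id)
{
	struct vf_entry *vf = calloc(1, sizeof(*vf));

	if (!vf)
		return NULL;
	vf->id = id;
	list_add_front(&vf->mbx_node, mbx_list);
	return vf;
}

/*
 * Single teardown helper: unlink first, then free. Calling free() alone
 * would leave the list pointing at freed memory, which is the kind of
 * corruption the trace in this thread reports.
 */
static void vf_destroy(struct vf_entry *vf)
{
	list_del_node(&vf->mbx_node);
	free(vf);
}

/* Create an entry, tear it down as an error path would, then retry. */
static size_t simulate_failed_enable_then_retry(void)
{
	struct list_node mbx_list;
	struct vf_entry *vf;
	size_t len;

	list_init(&mbx_list);

	vf = vf_create(&mbx_list, 0);
	assert(vf);
	vf_destroy(vf);                  /* error path: unlink and free */
	assert(list_len(&mbx_list) == 0);

	vf = vf_create(&mbx_list, 0);    /* second attempt sees a sane list */
	assert(vf);
	len = list_len(&mbx_list);
	vf_destroy(vf);
	return len;
}
```

Calling simulate_failed_enable_then_retry() walks through the same sequence
as the reproducer (create, fail, create again) against this toy list. If
vf_destroy() only called free(), the second vf_create() would write through
a dangling pointer, which is undefined behaviour in userspace and shows up
as the list_add corruption or KASAN report in the kernel.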