[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CH3PR11MB83135C86680DFCF93D31597DEADF2@CH3PR11MB8313.namprd11.prod.outlook.com>
Date: Fri, 5 Jul 2024 13:12:26 +0000
From: "Rout, ChandanX" <chandanx.rout@...el.com>
To: "Kubiak, Michal" <michal.kubiak@...el.com>,
"intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>
CC: "Fijalkowski, Maciej" <maciej.fijalkowski@...el.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "Drewek, Wojciech"
<wojciech.drewek@...el.com>, "Kuruvinakunnel, George"
<george.kuruvinakunnel@...el.com>, "Keller, Jacob E"
<jacob.e.keller@...el.com>, "kuba@...nel.org" <kuba@...nel.org>,
"Kuruvinakunnel, George" <george.kuruvinakunnel@...el.com>, "Pandey, Atul"
<atul.pandey@...el.com>, "Nagraj, Shravan" <shravan.nagraj@...el.com>
Subject: RE: [PATCH iwl-net v2] i40e: Fix XDP program unloading while removing
the driver
>-----Original Message-----
>From: Kubiak, Michal <michal.kubiak@...el.com>
>Sent: Monday, June 24, 2024 8:42 PM
>To: intel-wired-lan@...ts.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@...el.com>;
>netdev@...r.kernel.org; Drewek, Wojciech <wojciech.drewek@...el.com>;
>Kuruvinakunnel, George <george.kuruvinakunnel@...el.com>; Keller, Jacob E
><jacob.e.keller@...el.com>; kuba@...nel.org; Kubiak, Michal
><michal.kubiak@...el.com>
>Subject: [PATCH iwl-net v2] i40e: Fix XDP program unloading while removing
>the driver
>
>The commit 6533e558c650 ("i40e: Fix reset path while removing the driver")
>introduced a new PF state "__I40E_IN_REMOVE" to block modifying the XDP
>program while the driver is being removed.
>Unfortunately, such a change is useful only if the ".ndo_bpf()"
>callback was called out of the rmmod context because unloading the existing
>XDP program is also a part of driver removing procedure.
>In other words, from the rmmod context the driver is expected to unload the
>XDP program without reporting any errors. Otherwise, the kernel warning with
>callstack is printed out to dmesg.
>
>Example failing scenario:
> 1. Load the i40e driver.
> 2. Load the XDP program.
> 3. Unload the i40e driver (using "rmmod" command).
>
>The example kernel warning log:
>
>[ +0.004646] WARNING: CPU: 94 PID: 10395 at net/core/dev.c:9290
>unregister_netdevice_many_notify+0x7a9/0x870
>[...]
>[ +0.010959] RIP: 0010:unregister_netdevice_many_notify+0x7a9/0x870
>[...]
>[ +0.002726] Call Trace:
>[ +0.002457] <TASK>
>[ +0.002119] ? __warn+0x80/0x120
>[ +0.003245] ? unregister_netdevice_many_notify+0x7a9/0x870
>[ +0.005586] ? report_bug+0x164/0x190
>[ +0.003678] ? handle_bug+0x3c/0x80
>[ +0.003503] ? exc_invalid_op+0x17/0x70 [ +0.003846] ?
>asm_exc_invalid_op+0x1a/0x20 [ +0.004200] ?
>unregister_netdevice_many_notify+0x7a9/0x870
>[ +0.005579] ? unregister_netdevice_many_notify+0x3cc/0x870
>[ +0.005586] unregister_netdevice_queue+0xf7/0x140
>[ +0.004806] unregister_netdev+0x1c/0x30 [ +0.003933]
>i40e_vsi_release+0x87/0x2f0 [i40e] [ +0.004604] i40e_remove+0x1a1/0x420
>[i40e] [ +0.004220] pci_device_remove+0x3f/0xb0 [ +0.003943]
>device_release_driver_internal+0x19f/0x200
>[ +0.005243] driver_detach+0x48/0x90
>[ +0.003586] bus_remove_driver+0x6d/0xf0 [ +0.003939]
>pci_unregister_driver+0x2e/0xb0 [ +0.004278] i40e_exit_module+0x10/0x5f0
>[i40e] [ +0.004570] __do_sys_delete_module.isra.0+0x197/0x310
>[ +0.005153] do_syscall_64+0x85/0x170
>[ +0.003684] ? syscall_exit_to_user_mode+0x69/0x220
>[ +0.004886] ? do_syscall_64+0x95/0x170 [ +0.003851] ?
>exc_page_fault+0x7e/0x180 [ +0.003932]
>entry_SYSCALL_64_after_hwframe+0x71/0x79
>[ +0.005064] RIP: 0033:0x7f59dc9347cb
>[ +0.003648] Code: 73 01 c3 48 8b 0d 65 16 0c 00 f7 d8 64 89 01 48 83
>c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f
>05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48 [
>+0.018753] RSP: 002b:00007ffffac99048 EFLAGS: 00000206 ORIG_RAX:
>00000000000000b0 [ +0.007577] RAX: ffffffffffffffda RBX:
>0000559b9bb2f6e0 RCX: 00007f59dc9347cb [ +0.007140] RDX:
>0000000000000000 RSI: 0000000000000800 RDI: 0000559b9bb2f748 [
>+0.007146] RBP: 00007ffffac99070 R08: 1999999999999999 R09:
>0000000000000000 [ +0.007133] R10: 00007f59dc9a5ac0 R11:
>0000000000000206 R12: 0000000000000000 [ +0.007141] R13:
>00007ffffac992d8 R14: 0000559b9bb2f6e0 R15: 0000000000000000 [
>+0.007151] </TASK> [ +0.002204] ---[ end trace 0000000000000000 ]---
>
>Fix this by checking if the XDP program is being loaded or unloaded.
>Then, block only loading a new program while "__I40E_IN_REMOVE" is set.
>Also, move testing "__I40E_IN_REMOVE" flag to the beginning of XDP_SETUP
>callback to avoid unnecessary operations and checks.
>
>Fixes: 6533e558c650 ("i40e: Fix reset path while removing the driver")
>Signed-off-by: Michal Kubiak <michal.kubiak@...el.com>
>
>---
>
>v1 -> v2 changes:
> - simplify the fix according to Kuba's suggestions, i.e. remove
> checking 'NETREG_UNREGISTERING' flag directly from ndo_bpf
> and remove a separate handling for 'unregister' context.
> - update the commit message accordingly
> - add an example of the kernel warning for the issue being fixed.
>
>See:
>v1: <https://lore.kernel.org/netdev/20240528-net-2024-05-28-intel-net-
>fixes-v1-4-dc8593d2bbc6@...el.com/>
>---
> drivers/net/ethernet/intel/i40e/i40e_main.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
Tested-by: Chandan Kumar Rout <chandanx.rout@...el.com> (A Contingent Worker at Intel)
Powered by blists - more mailing lists