[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <BL0PR11MB3122D70ABDE6C2ACEE376073BD90A@BL0PR11MB3122.namprd11.prod.outlook.com>
Date: Mon, 18 Dec 2023 05:30:56 +0000
From: "Pucha, HimasekharX Reddy" <himasekharx.reddy.pucha@...el.com>
To: "Chmielewski, Pawel" <pawel.chmielewski@...el.com>,
"intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>
CC: "Kwan, Ngai-mint" <ngai-mint.kwan@...el.com>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "Chmielewski, Pawel" <pawel.chmielewski@...el.com>,
"Polchlopek, Mateusz" <mateusz.polchlopek@...el.com>
Subject: RE: [Intel-wired-lan] [PATCH iwl-net v2] ice: Do not get coalesce
settings while in reset
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@...osl.org> On Behalf Of Pawel Chmielewski
> Sent: Wednesday, December 6, 2023 11:10 PM
> To: intel-wired-lan@...ts.osuosl.org
> Cc: Kwan, Ngai-mint <ngai-mint.kwan@...el.com>; netdev@...r.kernel.org; Chmielewski, Pawel <pawel.chmielewski@...el.com>; Polchlopek, Mateusz <mateusz.polchlopek@...el.com>
> Subject: [Intel-wired-lan] [PATCH iwl-net v2] ice: Do not get coalesce settings while in reset
>
> From: Ngai-Mint Kwan <ngai-mint.kwan@...el.com>
>
> Getting coalesce settings while reset is in progress can cause NULL
> pointer deference bug.
> If under reset, abort get coalesce for ethtool.
>
> Fixes: 67fe64d78c437 ("ice: Implement getting and setting ethtool coalesce")
> Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@...el.com>
> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@...el.com>
> Signed-off-by: Pawel Chmielewski <pawel.chmielewski@...el.com>
>---
> Changes since v1:
> * Added "Fixes:" tag
> * targeting iwl-net instead of iwl-next
> ---
> ---
> drivers/net/ethernet/intel/ice/ice_ethtool.c | 3 +++
> 1 file changed, 3 insertions(+)
After applying the patch observing new crash.
Reproduction steps:
#while true; do ethtool -c eth0; done
#echo 1 > /sys/bus/pci/devices/0000\:18\:00.0/reset
[Dec12 00:12] ice 0000:18:00.0: PTP reset successful
[ +0.859959] ------------[ cut here ]------------
[ +0.000002] RTNL: assertion failed at net/core/dev.c (6422)
[ +0.000017] WARNING: CPU: 88 PID: 539037 at net/core/dev.c:6422 netif_queue_set_napi+0xba/0xd0
[ +0.000008] Modules linked in: irdma ice snd_seq_dummy snd_hrtimer snd_seq snd_timer snd_seq_device snd soundcore qrtr rfkill vfat fat xfs libcrc32c rpcrdma sunrpc rdma_ucm ib_srpt intel_rapl_msr intel_rapl_common ib_isert intel_uncore_frequency intel_uncore_frequency_common iscsi_target_mod target_core_mod isst_if_common skx_edac nfit ib_iser libnvdimm libiscsi scsi_transport_iscsi x86_pkg_temp_thermal intel_powerclamp rdma_cm coretemp iw_cm ib_cm kvm_intel ipmi_ssif kvm irqbypass rapl intel_cstate iTCO_wdt iTCO_vendor_support ib_uverbs intel_uncore acpi_ipmi mei_me i2c_i801 ipmi_si pcspkr ib_core mei i2c_smbus lpc_ich ipmi_devintf intel_pch_thermal joydev ioatdma ipmi_msghandler acpi_power_meter acpi_pad ext4 mbcache jbd2 ast drm_shmem_helper drm_kms_helper sd_mod t10_pi sg ixgbe drm crct10dif_pclmul i40e crc32_pclmul ahci crc32c_intel libahci igb ghash_clmulni_intel libata i2c_algo_bit mdio dca gnss wmi fuse [last unloaded: ice]
[ +0.000054] CPU: 88 PID: 539037 Comm: bash Kdump: loaded Not tainted 6.7.0-rc4_next-queue_11th_Dec-2023-00891-g9615a96563f0 #1
[ +0.000003] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[ +0.000001] RIP: 0010:netif_queue_set_napi+0xba/0xd0
[ +0.000003] Code: 75 9e 80 3d d3 ba 2c 01 00 75 95 ba 16 19 00 00 48 c7 c6 fc 85 27 85 48 c7 c7 10 25 1c 85 c6 05 b7 ba 2c 01 01 e8 c6 cf 6a ff <0f> 0b e9 6f ff ff ff 0f 0b 5b 5d 41 5c 41 5d c3 cc cc cc cc 66 90
[ +0.000001] RSP: 0018:ffffc9002d827b30 EFLAGS: 00010282
[ +0.000002] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027
[ +0.000001] RDX: ffff88980fc1f8c8 RSI: 0000000000000001 RDI: ffff88980fc1f8c0
[ +0.000001] RBP: ffff888c984dd010 R08: 0000000000000000 R09: 00000000ffff7fff
[ +0.000001] R10: ffffc9002d8279d0 R11: ffffffff857e6648 R12: 0000000000000000
[ +0.000001] R13: ffff8881362e8000 R14: ffff888c984dd010 R15: 0000000000000000
[ +0.000001] FS: 00007fdbde01d740(0000) GS:ffff88980fc00000(0000) knlGS:0000000000000000
[ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000001] CR2: 00007f7358c89000 CR3: 0000000107fcc006 CR4: 00000000007706f0
[ +0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ +0.000001] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ +0.000001] PKRU: 55555554
[ +0.000001] Call Trace:
[ +0.000001] <TASK>
[ +0.000002] ? __warn+0x80/0x130
[ +0.000004] ? netif_queue_set_napi+0xba/0xd0
[ +0.000002] ? report_bug+0x195/0x1a0
[ +0.000004] ? prb_read_valid+0x17/0x20
[ +0.000004] ? handle_bug+0x3c/0x70
[ +0.000005] ? exc_invalid_op+0x14/0x70
[ +0.000001] ? asm_exc_invalid_op+0x16/0x20
[ +0.000005] ? netif_queue_set_napi+0xba/0xd0
[ +0.000003] ice_q_vector_set_napi_queues+0x37/0xf0 [ice]
[ +0.000072] ice_vsi_cfg_def+0x423/0x830 [ice]
[ +0.000043] ice_vsi_rebuild+0x238/0x3c0 [ice]
[ +0.000042] ice_vsi_rebuild_by_type+0x76/0x180 [ice]
[ +0.000033] ice_rebuild+0x191/0x510 [ice]
[ +0.000041] ice_do_reset+0xa3/0x190 [ice]
[ +0.000056] ice_pci_err_resume+0x3b/0xb0 [ice]
[ +0.000035] pci_reset_function+0x48/0x70
[ +0.000005] reset_store+0x57/0xa0
[ +0.000004] kernfs_fop_write_iter+0x128/0x1c0
[ +0.000004] vfs_write+0x2ac/0x3c0
[ +0.000003] ksys_write+0x5f/0xe0
[ +0.000002] do_syscall_64+0x5c/0xe0
[ +0.000003] ? do_user_addr_fault+0x336/0x680
[ +0.000006] ? exc_page_fault+0x65/0x150
[ +0.000003] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ +0.000003] RIP: 0033:0x7fdbddf3eb97
[ +0.000002] Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ +0.000001] RSP: 002b:00007ffdfc92bda8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ +0.000002] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fdbddf3eb97
[ +0.000001] RDX: 0000000000000002 RSI: 000055af480778a0 RDI: 0000000000000001
[ +0.000001] RBP: 000055af480778a0 R08: 0000000000000000 R09: 00007fdbddfb14e0
[ +0.000001] R10: 00007fdbddfb13e0 R11: 0000000000000246 R12: 0000000000000002
[ +0.000002] R13: 00007fdbddffb780 R14: 0000000000000002 R15: 00007fdbddff69e0
[ +0.000002] </TASK>
[ +0.000001] ---[ end trace 0000000000000000 ]---
[ +0.104086] ice 0000:18:00.0: VSI rebuilt. VSI index 0, type ICE_VSI_PF
[ +0.003689] ice 0000:18:00.0: VSI rebuilt. VSI index 1, type ICE_VSI_CTRL
Crash Without patch:
[ 251.069061] BUG: kernel NULL pointer dereference, address: 0000000000000028
[ 251.069065] #PF: supervisor read access in kernel mode
[ 251.069067] #PF: error_code(0x0000) - not-present page
[ 251.069069] PGD 0 P4D 0
[ 251.069072] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 251.069075] CPU: 3 PID: 20728 Comm: ethtool Kdump: loaded Not tainted 6.7.0-rc3_next-queue_4th-Dec-2023-00732-gda7b4d5ccb44 #1
[ 251.069078] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[ 251.069080] RIP: 0010:ice_get_q_coalesce+0x2e/0xa0 [ice]
[ 251.069158] Code: 00 55 53 48 89 fb 48 89 f7 48 83 ec 08 0f b7 8b 96 04 00 00 0f b7 83 92 04 00 00 39 d1 7e 30 48 8b 4b 20 48 63 ea 48 8b 0c e9 <48> 8b 71 28 48 81 c6 98 01 00 00 39 c2 7c 32 e8 fe fe ff ff 85 c0
[ 251.069160] RSP: 0018:ffffc900343af980 EFLAGS: 00010206
[ 251.069162] RAX: 0000000000000060 RBX: ffff888121c39028 RCX: 0000000000000000
[ 251.069164] RDX: 0000000000000000 RSI: ffff888106062d88 RDI: ffff888106062d88
[ 251.069165] RBP: 0000000000000000 R08: 0000000038687465 R09: 0000000000000000
[ 251.069167] R10: ffff888106062d80 R11: 0000000000000002 R12: 0000000000000000
[ 251.069168] R13: ffffc900343afa30 R14: 0000000000000013 R15: ffff888106062d80
[ 251.069169] FS: 00007f3901af2740(0000) GS:ffff888c106c0000(0000) knlGS:0000000000000000
[ 251.069171] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 251.069173] CR2: 0000000000000028 CR3: 000000029e2e2006 CR4: 00000000007706f0
[ 251.069174] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 251.069175] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 251.069177] PKRU: 55555554
[ 251.069178] Call Trace:
[ 251.069180] <TASK>
[ 251.069181] ? __die+0x20/0x70
[ 251.069187] ? page_fault_oops+0x76/0x170
[ 251.069191] ? exc_page_fault+0x65/0x150
[ 251.069195] ? asm_exc_page_fault+0x22/0x30
[ 251.069199] ? ice_get_q_coalesce+0x2e/0xa0 [ice]
[ 251.069258] ice_get_coalesce+0x13/0x30 [ice]
[ 251.069313] coalesce_prepare_data+0x59/0x80
[ 251.069318] ethnl_default_doit+0xf6/0x340
[ 251.069322] ? genl_family_rcv_msg_attrs_parse.constprop.0+0x8f/0xf0
[ 251.069326] genl_family_rcv_msg_doit+0xd9/0x130
[ 251.069329] genl_family_rcv_msg+0x14d/0x220
[ 251.069332] ? __pfx_ethnl_default_doit+0x10/0x10
[ 251.069336] genl_rcv_msg+0x47/0xa0
[ 251.069338] ? __pfx_genl_rcv_msg+0x10/0x10
[ 251.069341] netlink_rcv_skb+0x54/0x100
[ 251.069344] genl_rcv+0x24/0x40
[ 251.069346] netlink_unicast+0x243/0x360
[ 251.069349] netlink_sendmsg+0x206/0x450
[ 251.069352] __sys_sendto+0x1fe/0x210
[ 251.069355] ? ___sys_recvmsg+0x88/0xd0
[ 251.069359] ? __sys_recvmsg+0x56/0xa0
[ 251.069363] __x64_sys_sendto+0x20/0x30
[ 251.069365] do_syscall_64+0x5c/0xe0
[ 251.069369] ? syscall_exit_work+0x103/0x130
[ 251.069374] ? syscall_exit_to_user_mode+0x22/0x40
[ 251.069376] ? do_syscall_64+0x6b/0xe0
[ 251.069379] ? __count_memcg_events+0x3e/0x90
[ 251.069383] ? mm_account_fault+0x6c/0x100
[ 251.069387] ? handle_mm_fault+0xd8/0x210
[ 251.069389] ? do_user_addr_fault+0x336/0x680
[ 251.069392] ? exc_page_fault+0x65/0x150
[ 251.069394] entry_SYSCALL_64_after_hwframe+0x6e/0x76
[ 251.069396] RIP: 0033:0x7f390194fa9a
[ 251.069398] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[ 251.069400] RSP: 002b:00007ffd67aab4e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 251.069403] RAX: ffffffffffffffda RBX: 000055be8b68b340 RCX: 00007f390194fa9a
[ 251.069404] RDX: 0000000000000024 RSI: 000055be8b68b3b0 RDI: 0000000000000003
[ 251.069405] RBP: 000055be8b68b3b0 R08: 00007f3901af9200 R09: 000000000000000c
[ 251.069407] R10: 0000000000000000 R11: 0000000000000246 R12: 000055be898b4e10
[ 251.069408] R13: 0000000000000000 R14: 000055be8b68b2a0 R15: 0000000000000000
[ 251.069410] </TASK>
[ 251.069411] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_timer snd_seq_device snd soundcore qrtr rfkill vfat fat xfs libcrc32c rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common target_core_mod isst_if_common ib_iser skx_edac nfit libiscsi libnvdimm scsi_transport_iscsi rdma_cm x86_pkg_temp_thermal intel_powerclamp coretemp iw_cm ib_cm kvm_intel kvm ipmi_ssif irqbypass irdma rapl intel_cstate ib_uverbs iTCO_wdt iTCO_vendor_support intel_uncore mei_me acpi_ipmi ipmi_si i2c_i801 pcspkr ib_core mei i2c_smbus lpc_ich ipmi_devintf intel_pch_thermal ioatdma joydev ipmi_msghandler acpi_power_meter acpi_pad ext4 mbcache jbd2 ast drm_shmem_helper drm_kms_helper sd_mod t10_pi sg ice ixgbe drm i40e ahci crct10dif_pclmul libahci igb crc32_pclmul crc32c_intel ghash_clmulni_intel libata i2c_algo_bit mdio dca gnss wmi fuse
Powered by blists - more mailing lists