lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM6PR11MB4657209FFC300E207E600F3F9BAFA@DM6PR11MB4657.namprd11.prod.outlook.com>
Date: Thu, 9 Nov 2023 23:35:43 +0000
From: "Kubalewski, Arkadiusz" <arkadiusz.kubalewski@...el.com>
To: Jiri Pirko <jiri@...nulli.us>
CC: Vadim Fedorenko <vadim.fedorenko@...ux.dev>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "Michalik, Michal" <michal.michalik@...el.com>,
	"Olech, Milena" <milena.olech@...el.com>, "pabeni@...hat.com"
	<pabeni@...hat.com>, "kuba@...nel.org" <kuba@...nel.org>
Subject: RE: [PATCH net 0/3] dpll: fix unordered unbind/bind registerer issues

>From: Jiri Pirko <jiri@...nulli.us>
>Sent: Thursday, November 9, 2023 7:07 PM
>
>Thu, Nov 09, 2023 at 06:20:14PM CET, arkadiusz.kubalewski@...el.com wrote:
>>>From: Vadim Fedorenko <vadim.fedorenko@...ux.dev>
>>>Sent: Thursday, November 9, 2023 11:51 AM
>>>
>>>On 08/11/2023 10:32, Arkadiusz Kubalewski wrote:
>>>> Fix issues when performing unordered unbind/bind of a kernel modules
>>>> which are using a dpll device with DPLL_PIN_TYPE_MUX pins.
>>>> Currently only serialized bind/unbind of such use case works, fix
>>>> the issues and allow for unserialized kernel module bind order.
>>>>
>>>> The issues are observed on the ice driver, i.e.,
>>>>
>>>> $ echo 0000:af:00.0 > /sys/bus/pci/drivers/ice/unbind
>>>> $ echo 0000:af:00.1 > /sys/bus/pci/drivers/ice/unbind
>>>>
>>>> results in:
>>>>
>>>> ice 0000:af:00.0: Removed PTP clock
>>>> BUG: kernel NULL pointer dereference, address: 0000000000000010
>>>> PF: supervisor read access in kernel mode
>>>> PF: error_code(0x0000) - not-present page
>>>> PGD 0 P4D 0
>>>> Oops: 0000 [#1] PREEMPT SMP PTI
>>>> CPU: 7 PID: 71848 Comm: bash Kdump: loaded Not tainted 6.6.0-rc5_next-
>>>>queue_19th-Oct-2023-01625-g039e5d15e451 #1
>>>> Hardware name: Intel Corporation S2600STB/S2600STB, BIOS
>>>>SE5C620.86B.02.01.0008.031920191559 03/19/2019
>>>> RIP: 0010:ice_dpll_rclk_state_on_pin_get+0x2f/0x90 [ice]
>>>> Code: 41 57 4d 89 cf 41 56 41 55 4d 89 c5 41 54 55 48 89 f5 53 4c 8b 66
>>>>08 48 89 cb 4d 8d b4 24 f0 49 00 00 4c 89 f7 e8 71 ec 1f c5 <0f> b6 5b
>>>>10
>>>>41 0f b6 84 24 30 4b 00 00 29 c3 41 0f b6 84 24 28 4b
>>>> RSP: 0018:ffffc902b179fb60 EFLAGS: 00010246
>>>> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
>>>> RDX: ffff8882c1398000 RSI: ffff888c7435cc60 RDI: ffff888c7435cb90
>>>> RBP: ffff888c7435cc60 R08: ffffc902b179fbb0 R09: 0000000000000000
>>>> R10: ffff888ef1fc8050 R11: fffffffffff82700 R12: ffff888c743581a0
>>>> R13: ffffc902b179fbb0 R14: ffff888c7435cb90 R15: 0000000000000000
>>>> FS:  00007fdc7dae0740(0000) GS:ffff888c105c0000(0000)
>>>>knlGS:0000000000000000
>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 0000000000000010 CR3: 0000000132c24002 CR4: 00000000007706e0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>> PKRU: 55555554
>>>> Call Trace:
>>>>   <TASK>
>>>>   ? __die+0x20/0x70
>>>>   ? page_fault_oops+0x76/0x170
>>>>   ? exc_page_fault+0x65/0x150
>>>>   ? asm_exc_page_fault+0x22/0x30
>>>>   ? ice_dpll_rclk_state_on_pin_get+0x2f/0x90 [ice]
>>>>   ? __pfx_ice_dpll_rclk_state_on_pin_get+0x10/0x10 [ice]
>>>>   dpll_msg_add_pin_parents+0x142/0x1d0
>>>>   dpll_pin_event_send+0x7d/0x150
>>>>   dpll_pin_on_pin_unregister+0x3f/0x100
>>>>   ice_dpll_deinit_pins+0xa1/0x230 [ice]
>>>>   ice_dpll_deinit+0x29/0xe0 [ice]
>>>>   ice_remove+0xcd/0x200 [ice]
>>>>   pci_device_remove+0x33/0xa0
>>>>   device_release_driver_internal+0x193/0x200
>>>>   unbind_store+0x9d/0xb0
>>>>   kernfs_fop_write_iter+0x128/0x1c0
>>>>   vfs_write+0x2bb/0x3e0
>>>>   ksys_write+0x5f/0xe0
>>>>   do_syscall_64+0x59/0x90
>>>>   ? filp_close+0x1b/0x30
>>>>   ? do_dup2+0x7d/0xd0
>>>>   ? syscall_exit_work+0x103/0x130
>>>>   ? syscall_exit_to_user_mode+0x22/0x40
>>>>   ? do_syscall_64+0x69/0x90
>>>>   ? syscall_exit_work+0x103/0x130
>>>>   ? syscall_exit_to_user_mode+0x22/0x40
>>>>   ? do_syscall_64+0x69/0x90
>>>>   entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>>>> RIP: 0033:0x7fdc7d93eb97
>>>> Code: 0b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e
>>>>fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0
>>>>ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
>>>> RSP: 002b:00007fff2aa91028 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>>>> RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fdc7d93eb97
>>>> RDX: 000000000000000d RSI: 00005644814ec9b0 RDI: 0000000000000001
>>>> RBP: 00005644814ec9b0 R08: 0000000000000000 R09: 00007fdc7d9b14e0
>>>> R10: 00007fdc7d9b13e0 R11: 0000000000000246 R12: 000000000000000d
>>>> R13: 00007fdc7d9fb780 R14: 000000000000000d R15: 00007fdc7d9f69e0
>>>>   </TASK>
>>>> Modules linked in: uinput vfio_pci vfio_pci_core vfio_iommu_type1 vfio
>>>>irqbypass ixgbevf snd_seq_dummy snd_hrtimer snd_seq snd_timer
>>>>snd_seq_device snd soundcore overlay qrtr rfkill vfat fat xfs libcrc32c
>>>>rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod
>>>>target_core_mod
>>>>ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm intel_rapl_msr
>>>>intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common
>>>>isst_if_common skx_edac nfit libnvdimm ipmi_ssif x86_pkg_temp_thermal
>>>>intel_powerclamp coretemp irdma rapl intel_cstate ib_uverbs iTCO_wdt
>>>>iTCO_vendor_support acpi_ipmi intel_uncore mei_me ipmi_si pcspkr
>>>>i2c_i801
>>>>ib_core mei ipmi_devintf intel_pch_thermal ioatdma i2c_smbus
>>>>ipmi_msghandler lpc_ich joydev acpi_power_meter acpi_pad ext4 mbcache
>>>>jbd2
>>>>sd_mod t10_pi sg ast i2c_algo_bit drm_shmem_helper drm_kms_helper ice
>>>>crct10dif_pclmul ixgbe crc32_pclmul drm crc32c_intel ahci i40e libahci
>>>>ghash_clmulni_intel libata mdio dca gnss wmi fuse [last unloaded: iavf]
>>>> CR2: 0000000000000010
>>>>
>>>> Arkadiusz Kubalewski (3):
>>>>    dpll: fix pin dump crash after module unbind
>>>>    dpll: fix pin dump crash for rebound module
>>>>    dpll: fix register pin with unregistered parent pin
>>>>
>>>>   drivers/dpll/dpll_core.c    |  8 ++------
>>>>   drivers/dpll/dpll_core.h    |  4 ++--
>>>>   drivers/dpll/dpll_netlink.c | 37 ++++++++++++++++++++++--------------
>>>>-
>>>>   3 files changed, 26 insertions(+), 23 deletions(-)
>>>>
>>>
>>>
>>>I still don't get how can we end up with unregistered pin. And shouldn't
>>>drivers do unregister of dpll/pin during release procedure? I thought it
>>>was kind of agreement we reached while developing the subsystem.
>>>
>>
>>It's definitely not about ending up with unregistered pins.
>>
>>Usually the driver is loaded for PF0, PF1, PF2, PF3 and unloaded in
>>opposite
>>order: PF3, PF2, PF1, PF0. And this is working without any issues.
>
>Please fix this in the driver.
>

Thanks for your feedback, but this is already wrong advice.

Our HW/FW is designed in different way than yours, it doesn't mean it is wrong.
As you might recall from our sync meetings, the dpll subsystem is to unify
approaches and reduce the code in the drivers, where your advice is exactly
opposite, suggested fix would require to implement extra synchronization of the
dpll and pin registration state between driver instances, most probably with
use of additional modules like aux-bus or something similar, which was from the
very beginning something we tried to avoid.
Only ice uses the infrastructure of muxed pins, and this is broken as it
doesn't allow unbind the driver which have registered dpll and pins without
crashing the kernel, so a fix is required in dpll subsystem, not in the driver.

Thank you!
Arkadiusz

>
>>
>>Above crash is caused because of unordered driver unload, where dpll
>>subsystem
>>tries to notify muxed pin was deleted, but at that time the parent is
>>already
>>gone, thus data points to memory which is no longer available, thus crash
>>happens when trying to dump pin parents.
>>
>>This series fixes all issues I could find connected to the situation where
>>muxed-pins are trying to access their parents, when parent registerer was
>>removed
>>in the meantime.
>>
>>Thank you!
>>Arkadiusz

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ