lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <10339f70-f56c-4ea3-874e-765bc8d63340@app.fastmail.com>
Date: Sun, 09 Feb 2025 22:14:16 -0500
From: "Mark Pearson" <mpearson-lenovo@...ebb.ca>
To: "Kurt Borja" <kuurtb@...il.com>
Cc: "Hans de Goede" <hdegoede@...hat.com>,
 Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
 "Limonciello, Mario" <mario.limonciello@....com>,
 "platform-driver-x86@...r.kernel.org" <platform-driver-x86@...r.kernel.org>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] platform/x86: thinkpad_acpi: Fix registration of tpacpi platform
 driver

On Sun, Feb 9, 2025, at 10:04 PM, Kurt Borja wrote:
> On Sun Feb 9, 2025 at 9:35 PM -05, Mark Pearson wrote:
>> On Sun, Feb 9, 2025, at 9:10 PM, Kurt Borja wrote:
>>> On Sun Feb 9, 2025 at 8:26 PM -05, Mark Pearson wrote:
>>>> Hi,
>>>>
>>>> On Sun, Feb 9, 2025, at 8:18 PM, Kurt Borja wrote:
>>>>> Hi Mark,
>>>>>
>>>>> On Sun Feb 9, 2025 at 7:54 PM -05, Mark Pearson wrote:
>>>>>> Hi Kurt,
>>>>>>
>>>>>> On Sat, Feb 8, 2025, at 8:54 AM, Kurt Borja wrote:
>>>>>>> On Sat Feb 8, 2025 at 11:26 AM -05, Mark Pearson wrote:
>>>>>>>> Thanks Kurt,
>>>>>>>>
>>>>>>>> On Fri, Feb 7, 2025, at 11:56 PM, Kurt Borja wrote:
>>>>>>>>> Hi Mark,
>>>>>>>>>
>>>>>>>>> On Sat Feb 8, 2025 at 4:14 AM -05, Mark Pearson wrote:
>>>>>>>>>> When reviewing and testing the recent platform profile changes I had
>>>>>>>>>> missed that they prevent the tpacpi platform driver from registering.
>>>>>>>>>> This error is seen in the kernel logs, and the various tpacpi entries
>>>>>>>>>> are not created:
>>>>>>>>>> [ 7550.642171] platform thinkpad_acpi: Resources present before probing
>>>>>>>>>
>>>>>>>>> This happens because in thinkpad_acpi_module_init(), ibm_init() is
>>>>>>>>> called before platform_driver_register(&tpacpi_pdriver), therefore
>>>>>>>>> devm_platform_profile_register() is called before tpacpi_pdev probes.
>>>>>>>>>
>>>>>>>>> As you can verify in [1], in the probing sequence, the driver core
>>>>>>>>> verifies the devres list is empty, which in this case is not because of
>>>>>>>>> the devm_ call.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I believe this is because the platform_profile driver registers the
>>>>>>>>>> device as part of it's initialisation in devm_platform_profile_register,
>>>>>>>>>> and the thinkpad_acpi driver later fails as the resource is already used.
>>>>>>>>>>
>>>>>>>>>> Modified thinkpad_acpi so that it has a separate platform driver for the
>>>>>>>>>> profile handling, leaving the existing tpacpi_pdev to register
>>>>>>>>>> successfully.
>>>>>>>>>
>>>>>>>>> While this works, it does not address the problem directly. Also it is
>>>>>>>>> discouraged to create "fake" platform devices [2].
>>>>>>>>>
>>>>>>>>> May I suggest moving tpacpi_pdriver registration before ibm_init()
>>>>>>>>> instead, so ibm_init_struct's .init callbacks can use devres?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yep - you're right. Moving it before the init does also fix it.
>>>>>>>>
>>>>>>>> I can't see any issues with this approach, but I'll test it out a bit more carefully and do an updated version with this approach.
>>>>>>>
>>>>>>> Thinking about it a bit more. With this approach you should maybe create
>>>>>>> the tpacpi_pdev with platform_create_bundle() (I'm pretty sure you can
>>>>>>> pass a NULL (*probe) callback) to avoid races.
>>>>>>>
>>>>>>> platform_create_bundle() only returns after the device has been
>>>>>>> successfully bound to the driver.
>>>>>>>
>>>>>> Unfortunately having a null probe callback doesn't work - you end up with an oops for a null pointer dereference.
>>>>>
>>>>> Are you sure? I just tested this on the for-next branch and it works
>>>>> without issues. Also checked the code and (*probe) is only dereferenced
>>>>> safely inside platform_bus_type's probe. Maybe another pointer is being
>>>>> deferenced? Keep in mind that platform_create_bundle() also registers
>>>>> the driver so maybe there is an issue there.
>>>>>
>>>> Possibly - I have to admit I didn't go dig too hard, as when I added it I got:
>>>>
>>>> Feb 09 19:41:17 x1c12 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
>>>> Feb 09 19:41:17 x1c12 kernel: #PF: supervisor read access in kernel mode
>>>> Feb 09 19:41:17 x1c12 kernel: #PF: error_code(0x0000) - not-present page
>>>>
>>>> With bus_probe_device in the backtrace - and went 'oh well'.
>>>>
>>>> Are there any significant advantages to the approach that make it worthwhile debugging further what is going on? Moving the platform_driver_register is working nicely :)
>>>
>>> Now that I think about it maybe there is no significant advantages, at
>>> least in relation to
>>>
>>> 	[ 7550.642171] platform thinkpad_acpi: Resources present before probing
>>>
>>> because list_empty(&dev->devres_head) is checked synchronously.
>>>
>>> However, now the null deref worries me, because some sysfs groups are
>>> created on driver binding. Do you have the full backtrace? Just to be
>>> sure moving driver registration doesn't mess with anything.
>>
>> Oooops...
>> I didn't have the trace (at least that I could find) but I figured it would be easy to recreate it.
>> Went to make the change again...and realised what I had got wrong.
>> I needed to replace:
>>         tpacpi_pdev = platform_device_register_simple(TPACPI_DRVR_NAME, PLATFORM_DEVID_NONE,
>>                                                         NULL, 0);
>> with 
>>         tpacpi_pdev = platform_create_bundle(&tpacpi_pdriver, NULL, NULL, 0, NULL, 0);
>>
>> (previously I had replaced the platform_driver_register...sigh)
>>
>> With the change done, I think, correctly - no oops and everything is working.
>
> Good to know!
>
> I'm going through the sysfs groups attached to the platform device and I
> noticed some attributes may depend on subdrivers being initialized
> first. If this is the case, ibm_init() has to be called inside the
> platform driver probe for this to work. Like this:
>
> static int tpacpi_probe(struct platform_device *pdev)
> {
> 	/* Input init */
> 	...
> 	/* Subdrivers init */
> 	...
> 		ret = ibm_init(&ibms_init[i]);
> 	...
> }
>
> static int __init thinkpad_acpi_module_init(void)
> {
> 	...
> 	tpacpi_pdev = platform_create_bundle(&tpacpi_pdriver, tpacpi_probe,
> 					     NULL, 0, NULL, 0);
> 	...
> }
>
> Of course this complicates things, so another approach is to just use
>
> 	platform_profile_register()
>
> instead of the devm_ version.
>
> Of course, the first approach has the advantage that devres is now
> usable, so I'd go for that, but that's for you to decide.
>
Hmmmm.
I'd like to get a fix in so anybody updating to 6.13 don't get hit (the Fedora users will be getting it soon and Arch users are likely seeing it already).

I think I'll do a fix tomorrow using the non devm_ version as that's safest; and then take a bit more time with implementing a probe with the appropriate pieces.

Thanks again
Mark

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ