lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <D7OG3RQ7HSH0.1V6FHLWMNC5BW@gmail.com>
Date: Sun, 09 Feb 2025 22:28:51 -0500
From: "Kurt Borja" <kuurtb@...il.com>
To: "Mark Pearson" <mpearson-lenovo@...ebb.ca>
Cc: "Hans de Goede" <hdegoede@...hat.com>, Ilpo Järvinen
 <ilpo.jarvinen@...ux.intel.com>, "Limonciello, Mario"
 <mario.limonciello@....com>, "platform-driver-x86@...r.kernel.org"
 <platform-driver-x86@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] platform/x86: thinkpad_acpi: Fix registration of tpacpi
 platform driver

On Sun Feb 9, 2025 at 10:14 PM -05, Mark Pearson wrote:
> On Sun, Feb 9, 2025, at 10:04 PM, Kurt Borja wrote:
>> On Sun Feb 9, 2025 at 9:35 PM -05, Mark Pearson wrote:
>>> On Sun, Feb 9, 2025, at 9:10 PM, Kurt Borja wrote:
>>>> On Sun Feb 9, 2025 at 8:26 PM -05, Mark Pearson wrote:
>>>>> Hi,
>>>>>
>>>>> On Sun, Feb 9, 2025, at 8:18 PM, Kurt Borja wrote:
>>>>>> Hi Mark,
>>>>>>
>>>>>> On Sun Feb 9, 2025 at 7:54 PM -05, Mark Pearson wrote:
>>>>>>> Hi Kurt,
>>>>>>>
>>>>>>> On Sat, Feb 8, 2025, at 8:54 AM, Kurt Borja wrote:
>>>>>>>> On Sat Feb 8, 2025 at 11:26 AM -05, Mark Pearson wrote:
>>>>>>>>> Thanks Kurt,
>>>>>>>>>
>>>>>>>>> On Fri, Feb 7, 2025, at 11:56 PM, Kurt Borja wrote:
>>>>>>>>>> Hi Mark,
>>>>>>>>>>
>>>>>>>>>> On Sat Feb 8, 2025 at 4:14 AM -05, Mark Pearson wrote:
>>>>>>>>>>> When reviewing and testing the recent platform profile changes I had
>>>>>>>>>>> missed that they prevent the tpacpi platform driver from registering.
>>>>>>>>>>> This error is seen in the kernel logs, and the various tpacpi entries
>>>>>>>>>>> are not created:
>>>>>>>>>>> [ 7550.642171] platform thinkpad_acpi: Resources present before probing
>>>>>>>>>>
>>>>>>>>>> This happens because in thinkpad_acpi_module_init(), ibm_init() is
>>>>>>>>>> called before platform_driver_register(&tpacpi_pdriver), therefore
>>>>>>>>>> devm_platform_profile_register() is called before tpacpi_pdev probes.
>>>>>>>>>>
>>>>>>>>>> As you can verify in [1], in the probing sequence, the driver core
>>>>>>>>>> verifies the devres list is empty, which in this case is not because of
>>>>>>>>>> the devm_ call.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I believe this is because the platform_profile driver registers the
>>>>>>>>>>> device as part of it's initialisation in devm_platform_profile_register,
>>>>>>>>>>> and the thinkpad_acpi driver later fails as the resource is already used.
>>>>>>>>>>>
>>>>>>>>>>> Modified thinkpad_acpi so that it has a separate platform driver for the
>>>>>>>>>>> profile handling, leaving the existing tpacpi_pdev to register
>>>>>>>>>>> successfully.
>>>>>>>>>>
>>>>>>>>>> While this works, it does not address the problem directly. Also it is
>>>>>>>>>> discouraged to create "fake" platform devices [2].
>>>>>>>>>>
>>>>>>>>>> May I suggest moving tpacpi_pdriver registration before ibm_init()
>>>>>>>>>> instead, so ibm_init_struct's .init callbacks can use devres?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yep - you're right. Moving it before the init does also fix it.
>>>>>>>>>
>>>>>>>>> I can't see any issues with this approach, but I'll test it out a bit more carefully and do an updated version with this approach.
>>>>>>>>
>>>>>>>> Thinking about it a bit more. With this approach you should maybe create
>>>>>>>> the tpacpi_pdev with platform_create_bundle() (I'm pretty sure you can
>>>>>>>> pass a NULL (*probe) callback) to avoid races.
>>>>>>>>
>>>>>>>> platform_create_bundle() only returns after the device has been
>>>>>>>> successfully bound to the driver.
>>>>>>>>
>>>>>>> Unfortunately having a null probe callback doesn't work - you end up with an oops for a null pointer dereference.
>>>>>>
>>>>>> Are you sure? I just tested this on the for-next branch and it works
>>>>>> without issues. Also checked the code and (*probe) is only dereferenced
>>>>>> safely inside platform_bus_type's probe. Maybe another pointer is being
>>>>>> deferenced? Keep in mind that platform_create_bundle() also registers
>>>>>> the driver so maybe there is an issue there.
>>>>>>
>>>>> Possibly - I have to admit I didn't go dig too hard, as when I added it I got:
>>>>>
>>>>> Feb 09 19:41:17 x1c12 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
>>>>> Feb 09 19:41:17 x1c12 kernel: #PF: supervisor read access in kernel mode
>>>>> Feb 09 19:41:17 x1c12 kernel: #PF: error_code(0x0000) - not-present page
>>>>>
>>>>> With bus_probe_device in the backtrace - and went 'oh well'.
>>>>>
>>>>> Are there any significant advantages to the approach that make it worthwhile debugging further what is going on? Moving the platform_driver_register is working nicely :)
>>>>
>>>> Now that I think about it maybe there is no significant advantages, at
>>>> least in relation to
>>>>
>>>> 	[ 7550.642171] platform thinkpad_acpi: Resources present before probing
>>>>
>>>> because list_empty(&dev->devres_head) is checked synchronously.
>>>>
>>>> However, now the null deref worries me, because some sysfs groups are
>>>> created on driver binding. Do you have the full backtrace? Just to be
>>>> sure moving driver registration doesn't mess with anything.
>>>
>>> Oooops...
>>> I didn't have the trace (at least that I could find) but I figured it would be easy to recreate it.
>>> Went to make the change again...and realised what I had got wrong.
>>> I needed to replace:
>>>         tpacpi_pdev = platform_device_register_simple(TPACPI_DRVR_NAME, PLATFORM_DEVID_NONE,
>>>                                                         NULL, 0);
>>> with 
>>>         tpacpi_pdev = platform_create_bundle(&tpacpi_pdriver, NULL, NULL, 0, NULL, 0);
>>>
>>> (previously I had replaced the platform_driver_register...sigh)
>>>
>>> With the change done, I think, correctly - no oops and everything is working.
>>
>> Good to know!
>>
>> I'm going through the sysfs groups attached to the platform device and I
>> noticed some attributes may depend on subdrivers being initialized
>> first. If this is the case, ibm_init() has to be called inside the
>> platform driver probe for this to work. Like this:
>>
>> static int tpacpi_probe(struct platform_device *pdev)
>> {
>> 	/* Input init */
>> 	...
>> 	/* Subdrivers init */
>> 	...
>> 		ret = ibm_init(&ibms_init[i]);
>> 	...
>> }
>>
>> static int __init thinkpad_acpi_module_init(void)
>> {
>> 	...
>> 	tpacpi_pdev = platform_create_bundle(&tpacpi_pdriver, tpacpi_probe,
>> 					     NULL, 0, NULL, 0);
>> 	...
>> }
>>
>> Of course this complicates things, so another approach is to just use
>>
>> 	platform_profile_register()
>>
>> instead of the devm_ version.
>>
>> Of course, the first approach has the advantage that devres is now
>> usable, so I'd go for that, but that's for you to decide.
>>
> Hmmmm.
> I'd like to get a fix in so anybody updating to 6.13 don't get hit (the Fedora users will be getting it soon and Arch users are likely seeing it already).

Thankfully Arch does not distribute rc versions.

>
> I think I'll do a fix tomorrow using the non devm_ version as that's safest; and then take a bit more time with implementing a probe with the appropriate pieces.

I agree!

>
> Thanks again

Thanks for fixing it. This one is on me, I naively assumed the device
was alread bound.

--
~ Kurt

> Mark


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ