lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <D7OEFLKWR8VN.247WSBSGFSD2X@gmail.com>
Date: Sun, 09 Feb 2025 21:10:15 -0500
From: "Kurt Borja" <kuurtb@...il.com>
To: "Mark Pearson" <mpearson-lenovo@...ebb.ca>
Cc: "Hans de Goede" <hdegoede@...hat.com>, Ilpo Järvinen
 <ilpo.jarvinen@...ux.intel.com>, "Limonciello, Mario"
 <mario.limonciello@....com>, "platform-driver-x86@...r.kernel.org"
 <platform-driver-x86@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] platform/x86: thinkpad_acpi: Fix registration of tpacpi
 platform driver

On Sun Feb 9, 2025 at 8:26 PM -05, Mark Pearson wrote:
> Hi,
>
> On Sun, Feb 9, 2025, at 8:18 PM, Kurt Borja wrote:
>> Hi Mark,
>>
>> On Sun Feb 9, 2025 at 7:54 PM -05, Mark Pearson wrote:
>>> Hi Kurt,
>>>
>>> On Sat, Feb 8, 2025, at 8:54 AM, Kurt Borja wrote:
>>>> On Sat Feb 8, 2025 at 11:26 AM -05, Mark Pearson wrote:
>>>>> Thanks Kurt,
>>>>>
>>>>> On Fri, Feb 7, 2025, at 11:56 PM, Kurt Borja wrote:
>>>>>> Hi Mark,
>>>>>>
>>>>>> On Sat Feb 8, 2025 at 4:14 AM -05, Mark Pearson wrote:
>>>>>>> When reviewing and testing the recent platform profile changes I had
>>>>>>> missed that they prevent the tpacpi platform driver from registering.
>>>>>>> This error is seen in the kernel logs, and the various tpacpi entries
>>>>>>> are not created:
>>>>>>> [ 7550.642171] platform thinkpad_acpi: Resources present before probing
>>>>>>
>>>>>> This happens because in thinkpad_acpi_module_init(), ibm_init() is
>>>>>> called before platform_driver_register(&tpacpi_pdriver), therefore
>>>>>> devm_platform_profile_register() is called before tpacpi_pdev probes.
>>>>>>
>>>>>> As you can verify in [1], in the probing sequence, the driver core
>>>>>> verifies the devres list is empty, which in this case is not because of
>>>>>> the devm_ call.
>>>>>>
>>>>>>>
>>>>>>> I believe this is because the platform_profile driver registers the
>>>>>>> device as part of it's initialisation in devm_platform_profile_register,
>>>>>>> and the thinkpad_acpi driver later fails as the resource is already used.
>>>>>>>
>>>>>>> Modified thinkpad_acpi so that it has a separate platform driver for the
>>>>>>> profile handling, leaving the existing tpacpi_pdev to register
>>>>>>> successfully.
>>>>>>
>>>>>> While this works, it does not address the problem directly. Also it is
>>>>>> discouraged to create "fake" platform devices [2].
>>>>>>
>>>>>> May I suggest moving tpacpi_pdriver registration before ibm_init()
>>>>>> instead, so ibm_init_struct's .init callbacks can use devres?
>>>>>>
>>>>>
>>>>> Yep - you're right. Moving it before the init does also fix it.
>>>>>
>>>>> I can't see any issues with this approach, but I'll test it out a bit more carefully and do an updated version with this approach.
>>>>
>>>> Thinking about it a bit more. With this approach you should maybe create
>>>> the tpacpi_pdev with platform_create_bundle() (I'm pretty sure you can
>>>> pass a NULL (*probe) callback) to avoid races.
>>>>
>>>> platform_create_bundle() only returns after the device has been
>>>> successfully bound to the driver.
>>>>
>>> Unfortunately having a null probe callback doesn't work - you end up with an oops for a null pointer dereference.
>>
>> Are you sure? I just tested this on the for-next branch and it works
>> without issues. Also checked the code and (*probe) is only dereferenced
>> safely inside platform_bus_type's probe. Maybe another pointer is being
>> deferenced? Keep in mind that platform_create_bundle() also registers
>> the driver so maybe there is an issue there.
>>
> Possibly - I have to admit I didn't go dig too hard, as when I added it I got:
>
> Feb 09 19:41:17 x1c12 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
> Feb 09 19:41:17 x1c12 kernel: #PF: supervisor read access in kernel mode
> Feb 09 19:41:17 x1c12 kernel: #PF: error_code(0x0000) - not-present page
>
> With bus_probe_device in the backtrace - and went 'oh well'.
>
> Are there any significant advantages to the approach that make it worthwhile debugging further what is going on? Moving the platform_driver_register is working nicely :)

Now that I think about it maybe there is no significant advantages, at
least in relation to

	[ 7550.642171] platform thinkpad_acpi: Resources present before probing

because list_empty(&dev->devres_head) is checked synchronously.

However, now the null deref worries me, because some sysfs groups are
created on driver binding. Do you have the full backtrace? Just to be
sure moving driver registration doesn't mess with anything.

I apologize for turning a quick fix into this :p

--
~ Kurt

>
> Thanks
> Mark


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ