lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <80307204-c401-4411-aa2a-3df7e11a45ce@app.fastmail.com>
Date: Sun, 09 Feb 2025 20:26:52 -0500
From: "Mark Pearson" <mpearson-lenovo@...ebb.ca>
To: "Kurt Borja" <kuurtb@...il.com>
Cc: "Hans de Goede" <hdegoede@...hat.com>,
 Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
 "Limonciello, Mario" <mario.limonciello@....com>,
 "platform-driver-x86@...r.kernel.org" <platform-driver-x86@...r.kernel.org>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] platform/x86: thinkpad_acpi: Fix registration of tpacpi platform
 driver

Hi,

On Sun, Feb 9, 2025, at 8:18 PM, Kurt Borja wrote:
> Hi Mark,
>
> On Sun Feb 9, 2025 at 7:54 PM -05, Mark Pearson wrote:
>> Hi Kurt,
>>
>> On Sat, Feb 8, 2025, at 8:54 AM, Kurt Borja wrote:
>>> On Sat Feb 8, 2025 at 11:26 AM -05, Mark Pearson wrote:
>>>> Thanks Kurt,
>>>>
>>>> On Fri, Feb 7, 2025, at 11:56 PM, Kurt Borja wrote:
>>>>> Hi Mark,
>>>>>
>>>>> On Sat Feb 8, 2025 at 4:14 AM -05, Mark Pearson wrote:
>>>>>> When reviewing and testing the recent platform profile changes I had
>>>>>> missed that they prevent the tpacpi platform driver from registering.
>>>>>> This error is seen in the kernel logs, and the various tpacpi entries
>>>>>> are not created:
>>>>>> [ 7550.642171] platform thinkpad_acpi: Resources present before probing
>>>>>
>>>>> This happens because in thinkpad_acpi_module_init(), ibm_init() is
>>>>> called before platform_driver_register(&tpacpi_pdriver), therefore
>>>>> devm_platform_profile_register() is called before tpacpi_pdev probes.
>>>>>
>>>>> As you can verify in [1], in the probing sequence, the driver core
>>>>> verifies the devres list is empty, which in this case is not because of
>>>>> the devm_ call.
>>>>>
>>>>>>
>>>>>> I believe this is because the platform_profile driver registers the
>>>>>> device as part of it's initialisation in devm_platform_profile_register,
>>>>>> and the thinkpad_acpi driver later fails as the resource is already used.
>>>>>>
>>>>>> Modified thinkpad_acpi so that it has a separate platform driver for the
>>>>>> profile handling, leaving the existing tpacpi_pdev to register
>>>>>> successfully.
>>>>>
>>>>> While this works, it does not address the problem directly. Also it is
>>>>> discouraged to create "fake" platform devices [2].
>>>>>
>>>>> May I suggest moving tpacpi_pdriver registration before ibm_init()
>>>>> instead, so ibm_init_struct's .init callbacks can use devres?
>>>>>
>>>>
>>>> Yep - you're right. Moving it before the init does also fix it.
>>>>
>>>> I can't see any issues with this approach, but I'll test it out a bit more carefully and do an updated version with this approach.
>>>
>>> Thinking about it a bit more. With this approach you should maybe create
>>> the tpacpi_pdev with platform_create_bundle() (I'm pretty sure you can
>>> pass a NULL (*probe) callback) to avoid races.
>>>
>>> platform_create_bundle() only returns after the device has been
>>> successfully bound to the driver.
>>>
>> Unfortunately having a null probe callback doesn't work - you end up with an oops for a null pointer dereference.
>
> Are you sure? I just tested this on the for-next branch and it works
> without issues. Also checked the code and (*probe) is only dereferenced
> safely inside platform_bus_type's probe. Maybe another pointer is being
> deferenced? Keep in mind that platform_create_bundle() also registers
> the driver so maybe there is an issue there.
>
Possibly - I have to admit I didn't go dig too hard, as when I added it I got:

Feb 09 19:41:17 x1c12 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
Feb 09 19:41:17 x1c12 kernel: #PF: supervisor read access in kernel mode
Feb 09 19:41:17 x1c12 kernel: #PF: error_code(0x0000) - not-present page

With bus_probe_device in the backtrace - and went 'oh well'.

Are there any significant advantages to the approach that make it worthwhile debugging further what is going on? Moving the platform_driver_register is working nicely :)

Thanks
Mark

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ