[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <f9047afd-d395-4733-a953-b7efa56e66c5@app.fastmail.com>
Date: Sun, 09 Feb 2025 21:35:34 -0500
From: "Mark Pearson" <mpearson-lenovo@...ebb.ca>
To: "Kurt Borja" <kuurtb@...il.com>
Cc: "Hans de Goede" <hdegoede@...hat.com>,
Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
"Limonciello, Mario" <mario.limonciello@....com>,
"platform-driver-x86@...r.kernel.org" <platform-driver-x86@...r.kernel.org>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] platform/x86: thinkpad_acpi: Fix registration of tpacpi platform
driver
On Sun, Feb 9, 2025, at 9:10 PM, Kurt Borja wrote:
> On Sun Feb 9, 2025 at 8:26 PM -05, Mark Pearson wrote:
>> Hi,
>>
>> On Sun, Feb 9, 2025, at 8:18 PM, Kurt Borja wrote:
>>> Hi Mark,
>>>
>>> On Sun Feb 9, 2025 at 7:54 PM -05, Mark Pearson wrote:
>>>> Hi Kurt,
>>>>
>>>> On Sat, Feb 8, 2025, at 8:54 AM, Kurt Borja wrote:
>>>>> On Sat Feb 8, 2025 at 11:26 AM -05, Mark Pearson wrote:
>>>>>> Thanks Kurt,
>>>>>>
>>>>>> On Fri, Feb 7, 2025, at 11:56 PM, Kurt Borja wrote:
>>>>>>> Hi Mark,
>>>>>>>
>>>>>>> On Sat Feb 8, 2025 at 4:14 AM -05, Mark Pearson wrote:
>>>>>>>> When reviewing and testing the recent platform profile changes I had
>>>>>>>> missed that they prevent the tpacpi platform driver from registering.
>>>>>>>> This error is seen in the kernel logs, and the various tpacpi entries
>>>>>>>> are not created:
>>>>>>>> [ 7550.642171] platform thinkpad_acpi: Resources present before probing
>>>>>>>
>>>>>>> This happens because in thinkpad_acpi_module_init(), ibm_init() is
>>>>>>> called before platform_driver_register(&tpacpi_pdriver), therefore
>>>>>>> devm_platform_profile_register() is called before tpacpi_pdev probes.
>>>>>>>
>>>>>>> As you can verify in [1], in the probing sequence, the driver core
>>>>>>> verifies the devres list is empty, which in this case is not because of
>>>>>>> the devm_ call.
>>>>>>>
>>>>>>>>
>>>>>>>> I believe this is because the platform_profile driver registers the
>>>>>>>> device as part of it's initialisation in devm_platform_profile_register,
>>>>>>>> and the thinkpad_acpi driver later fails as the resource is already used.
>>>>>>>>
>>>>>>>> Modified thinkpad_acpi so that it has a separate platform driver for the
>>>>>>>> profile handling, leaving the existing tpacpi_pdev to register
>>>>>>>> successfully.
>>>>>>>
>>>>>>> While this works, it does not address the problem directly. Also it is
>>>>>>> discouraged to create "fake" platform devices [2].
>>>>>>>
>>>>>>> May I suggest moving tpacpi_pdriver registration before ibm_init()
>>>>>>> instead, so ibm_init_struct's .init callbacks can use devres?
>>>>>>>
>>>>>>
>>>>>> Yep - you're right. Moving it before the init does also fix it.
>>>>>>
>>>>>> I can't see any issues with this approach, but I'll test it out a bit more carefully and do an updated version with this approach.
>>>>>
>>>>> Thinking about it a bit more. With this approach you should maybe create
>>>>> the tpacpi_pdev with platform_create_bundle() (I'm pretty sure you can
>>>>> pass a NULL (*probe) callback) to avoid races.
>>>>>
>>>>> platform_create_bundle() only returns after the device has been
>>>>> successfully bound to the driver.
>>>>>
>>>> Unfortunately having a null probe callback doesn't work - you end up with an oops for a null pointer dereference.
>>>
>>> Are you sure? I just tested this on the for-next branch and it works
>>> without issues. Also checked the code and (*probe) is only dereferenced
>>> safely inside platform_bus_type's probe. Maybe another pointer is being
>>> deferenced? Keep in mind that platform_create_bundle() also registers
>>> the driver so maybe there is an issue there.
>>>
>> Possibly - I have to admit I didn't go dig too hard, as when I added it I got:
>>
>> Feb 09 19:41:17 x1c12 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
>> Feb 09 19:41:17 x1c12 kernel: #PF: supervisor read access in kernel mode
>> Feb 09 19:41:17 x1c12 kernel: #PF: error_code(0x0000) - not-present page
>>
>> With bus_probe_device in the backtrace - and went 'oh well'.
>>
>> Are there any significant advantages to the approach that make it worthwhile debugging further what is going on? Moving the platform_driver_register is working nicely :)
>
> Now that I think about it maybe there is no significant advantages, at
> least in relation to
>
> [ 7550.642171] platform thinkpad_acpi: Resources present before probing
>
> because list_empty(&dev->devres_head) is checked synchronously.
>
> However, now the null deref worries me, because some sysfs groups are
> created on driver binding. Do you have the full backtrace? Just to be
> sure moving driver registration doesn't mess with anything.
Oooops...
I didn't have the trace (at least that I could find) but I figured it would be easy to recreate it.
Went to make the change again...and realised what I had got wrong.
I needed to replace:
tpacpi_pdev = platform_device_register_simple(TPACPI_DRVR_NAME, PLATFORM_DEVID_NONE,
NULL, 0);
with
tpacpi_pdev = platform_create_bundle(&tpacpi_pdriver, NULL, NULL, 0, NULL, 0);
(previously I had replaced the platform_driver_register...sigh)
With the change done, I think, correctly - no oops and everything is working.
>
> I apologize for turning a quick fix into this :p
No worries - I'd never come across platform_create_bundle so it's a good learning experience for me. Thanks for all the help.
Mark
Powered by blists - more mailing lists