Message-Id: <D8VDRV58EWSK.MKGI5JBD8RX6@gmail.com>
Date: Tue, 01 Apr 2025 11:43:54 -0300
From: "Kurt Borja" <kuurtb@...il.com>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Cc: "Henrique de Moraes Holschuh" <hmh@....eng.br>, "Hans de Goede"
 <hdegoede@...hat.com>, "Mark Pearson" <mpearson-lenovo@...ebb.ca>,
 <ibm-acpi-devel@...ts.sourceforge.net>,
 <platform-driver-x86@...r.kernel.org>, "LKML"
 <linux-kernel@...r.kernel.org>, <linux-riscv@...ts.infradead.org>, "Damian
 Tometzki" <damian@...cv-rocks.de>
Subject: Re: [PATCH] platform/x86: thinkpad_acpi: Fix NULL pointer
 dereferences while probing

Hi Ilpo,

On Tue Apr 1, 2025 at 8:24 AM -03, Ilpo Järvinen wrote:
> On Sun, 30 Mar 2025, Kurt Borja wrote:
>
>> Some subdrivers make use of the global reference tpacpi_pdev during
>> initialization, which is called from the platform driver's probe.
>> However, after
>> 
>> commit 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver initialization to tpacpi_pdriver's probe.")
>> 
>
> Next time, please include these into the paragraph flow normally obeying 
> the normal paragraph formatting. I changed them in this case.

Thanks, it won't happen again.

>
>> this variable is only properly initialized *after* probing and this can
>> result in a NULL pointer dereference.
>> 
>> In order to fix this without reverting the commit, register the platform
>> bundle in two steps, first create and initialize tpacpi_pdev, then
>> register the driver synchronously with platform_driver_probe(). This way
>> the benefits of commit 38b9ab80db31 are preserved.
>> 
>> Additionally,
>> 
>> commit 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON initialization to tpacpi_hwmon_pdriver's probe")
>> 
>> introduced a similar problem, however tpacpi_sensors_pdev is only used
>> once inside the probe, so replace the global reference with the one
>> given by the probe.
>> 
>> Reported-by: Damian Tometzki <damian@...cv-rocks.de>
>> Closes: https://lore.kernel.org/r/CAL=B37kdL1orSQZD2A3skDOevRXBzF__cJJgY_GFh9LZO3FMsw@mail.gmail.com/
>> Fixes: 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver initialization to tpacpi_pdriver's probe.")
>> Fixes: 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON initialization to tpacpi_hwmon_pdriver's probe")
>> Tested-by: Damian Tometzki <damian@...cv-rocks.de>
>> Signed-off-by: Kurt Borja <kuurtb@...il.com>
>
> Applied to the review-ilpo-fixes branch.

Thank you!

>
>> ---
>> Hi all,
>> 
>> The commit message is pretty self-explanatory. I have one question
>> though. As you can see in the crash dump of the original report:
>> 
>> Mar 29 17:43:16.180758 fedora kernel:  ? asm_exc_page_fault+0x26/0x30
>> Mar 29 17:43:16.180769 fedora kernel:  ? __pfx_klist_children_get+0x10/0x10
>> Mar 29 17:43:16.180781 fedora kernel:  ? kobject_get+0xd/0x70
>> Mar 29 17:43:16.180792 fedora kernel:  device_add+0x8f/0x6e0
>> Mar 29 17:43:16.180804 fedora kernel:  rfkill_register+0xbc/0x2c0 [rfkill]
>> Mar 29 17:43:16.180813 fedora kernel:  tpacpi_new_rfkill+0x185/0x230 [thinkpad_acpi]
>> 
>> The NULL dereference happens in device_add(), inside rfkill_register().
>> This bothers me because, as you can see here:
>> 
>>  1198                 atp_rfk->rfkill = rfkill_alloc(name,
>>  1199                                                 &tpacpi_pdev->dev,
>>  1200                                                 rfktype,
>>  1201                                                 &tpacpi_rfk_rfkill_ops,
>>  1202                                                 atp_rfk);
>> 
>> the NULL dereference happens in line 1199, inside tpacpi_new_rfkill(). I
>> think this disagreement might be due to compile time optimizations?
>
> How did you map it to line numbers? Is it just about difference in the 
> compiled binaries that results in different line numbers?

Oh - I just manually followed the call trace in search of the first
instance of a NULL dereference. If I understand correctly, inside
thinkpad_acpi we do reach rfkill_register(), which is line
 1227         res = rfkill_register(atp_rfk->rfkill);

and I imagine the fault happens when device_add() tries to get a reference
to the parent of the allocated rfkill device. But it's weird because we
shouldn't even reach 1227, as the NULL deref first happens at 1199.

NULL deref is UB so I guess it makes sense?

BTW I got all these line numbers using the base commit.

-- 
 ~ Kurt
