lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b7028264-167a-410a-af0c-a9aa156ca993@arm.com>
Date: Tue, 9 Jul 2024 10:49:03 +0100
From: Lukasz Luba <lukasz.luba@....com>
To: "Rafael J. Wysocki" <rafael@...nel.org>,
 AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>
Cc: Nícolas F. R. A. Prado <nfraprado@...labora.com>,
 Daniel Lezcano <daniel.lezcano@...aro.org>, Zhang Rui <rui.zhang@...el.com>,
 Nikita Travkin <nikita@...n.ru>, kernel@...labora.com,
 "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>, linux-pm@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] thermal: gov_power_allocator: Return early in manage if
 trip_max is null



On 7/3/24 21:20, Rafael J. Wysocki wrote:
> On Wed, Jul 3, 2024 at 11:03 AM AngeloGioacchino Del Regno
> <angelogioacchino.delregno@...labora.com> wrote:
>>
>> Il 02/07/24 23:24, Nícolas F. R. A. Prado ha scritto:
>>> Commit da781936e7c3 ("thermal: gov_power_allocator: Allow binding
>>> without trip points") allowed the governor to bind even when trip_max
>>> is null. This allows a null pointer dereference to happen in the manage
>>> callback. Add an early return to prevent it, since the governor is
>>> expected to not do anything in this case.
>>>
>>> Fixes: da781936e7c3 ("thermal: gov_power_allocator: Allow binding without trip points")
>>> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@...labora.com>
>>> ---
>>> This issue was noticed by KernelCI during a boot test on the
>>> mt8195-cherry-tomato-r2 platform with the config in [1]. The stack trace
>>> is attached below.
>>>
>>> [1] http://0x0.st/XaON.txt
>>>
>>> [    4.015786] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
>>> [    4.015791] Mem abort info:
>>> [    4.015793]   ESR = 0x0000000096000004
>>> [    4.015796]   EC = 0x25: DABT (current EL), IL = 32 bits
>>> [    4.015799]   SET = 0, FnV = 0
>>> [    4.015802]   EA = 0, S1PTW = 0
>>> [    4.015804]   FSC = 0x04: level 0 translation fault
>>> [    4.015807] Data abort info:
>>> [    4.015809]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
>>> [    4.015811]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>>> [    4.015814]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>> [    4.015818] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000109809000
>>> [    4.015821] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
>>> [    4.015835] Modules linked in: mt8195_mt6359(+) mt6577_auxadc snd_soc_mt8195_afe mtk_scp_ipi snd_sof_utils mtk_wdt(+)
>>> [    4.015852] CPU: 2 PID: 13 Comm: kworker/u32:1 Not tainted 6.10.0-rc6 #1 c5d519ae8e7fec6bbe67cb8c50bfebcb89dfa54e
>>> [    4.015859] Hardware name: Acer Tomato (rev2) board (DT)
>>> [    4.015862] Workqueue: events_unbound deferred_probe_work_func
>>> [    4.015875] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> [    4.015880] pc : power_allocator_manage+0x110/0x6a0
>>> [    4.015888] lr : __thermal_zone_device_update+0x1dc/0x400
>>> [    4.015893] sp : ffff8000800eb800
>>> [    4.015895] x29: ffff8000800eb810 x28: 0000000000000001 x27: 0000000000000001
>>> [    4.015903] x26: aaaaaaaaaaaaaaab x25: ffff07a0461c15a0 x24: ffffb58530ca67c0
>>> [    4.015911] x23: 0000000000000000 x22: ffff07a04098fcc0 x21: ffffb58532eec848
>>> [    4.015918] x20: ffff8000800eb920 x19: ffff07a0461c1000 x18: 0000000000000b4b
>>> [    4.015926] x17: 5359534255530031 x16: ffffb585310352e4 x15: 0000000000000020
>>> [    4.015933] x14: 0000000000000000 x13: ffffffff00000000 x12: 0000000000000040
>>> [    4.015940] x11: 0101010101010101 x10: ffffffffffffffff x9 : ffffb58530ca8d78
>>> [    4.015948] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : 0000000000001388
>>> [    4.015955] x5 : 0000000000000000 x4 : 0000000000000384 x3 : 0000000000000000
>>> [    4.015962] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
>>> [    4.015970] Call trace:
>>> [    4.015972]  power_allocator_manage+0x110/0x6a0
>>> [    4.015978]  __thermal_zone_device_update+0x1dc/0x400
>>> [    4.015983]  thermal_zone_device_set_mode+0x7c/0xa0
>>> [    4.015987]  thermal_zone_device_enable+0x1c/0x28
>>> [    4.015991]  thermal_of_zone_register+0x43c/0x498
>>> [    4.015996]  devm_thermal_of_zone_register+0x6c/0xb8
>>> [    4.016001]  gadc_thermal_probe+0x140/0x214
>>> [    4.016007]  platform_probe+0x70/0xc4
>>> [    4.016012]  really_probe+0x140/0x270
>>> [    4.016018]  __driver_probe_device+0xfc/0x114
>>> [    4.016024]  driver_probe_device+0x44/0x100
>>> [    4.016029]  __device_attach_driver+0x64/0xdc
>>> [    4.016035]  bus_for_each_drv+0xb4/0xdc
>>> [    4.016041]  __device_attach+0xdc/0x16c
>>> [    4.016046]  device_initial_probe+0x1c/0x28
>>> [    4.016052]  bus_probe_device+0x44/0xac
>>> [    4.016057]  deferred_probe_work_func+0xb0/0xc4
>>> [    4.016063]  process_scheduled_works+0x114/0x330
>>> [    4.016070]  worker_thread+0x1c0/0x20c
>>> [    4.016076]  kthread+0xf8/0x108
>>> [    4.016081]  ret_from_fork+0x10/0x20
>>> [    4.016090] Code: d1030294 17ffffdd f94012c0 f9401ed7 (b9400000)
>>> [    4.016095] ---[ end trace 0000000000000000 ]---
>>> ---
>>>    drivers/thermal/gov_power_allocator.c | 3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/thermal/gov_power_allocator.c b/drivers/thermal/gov_power_allocator.c
>>> index 45f04a25255a..1b2345a697c5 100644
>>> --- a/drivers/thermal/gov_power_allocator.c
>>> +++ b/drivers/thermal/gov_power_allocator.c
>>> @@ -759,6 +759,9 @@ static void power_allocator_manage(struct thermal_zone_device *tz)
>>>                return;
>>>        }
>>>
>>> +     if (!params->trip_max)
>>> +             return;
>>> +
>>
>> I'm not sure that this is the right thing to do.
>>
>> If you do that, allocate_power() will never be called, so the entire algo doesn't
>> work, making binding this completely useless (as it's going to be a noop..!).
>>
>> Check what get_governor_trips() says in the documentation:
>>
>>    * If there is only one trip point, then that's considered to be the
>>    * "maximum desired temperature" trip point and the governor is always
>>    * on.  If there are no passive or active trip points, then the
>>    * governor won't do anything.  In fact, its throttle function
>>    * won't be called at all.
>>
>> ....and it looks like you're aware of that, as you said that in the commit
>> description as well.
> 
> IIUC, the problematic commit allowed the power allocator governor to
> bind to a tripless thermal zone in order to prevent failing the entire
> thermal zone registration.
> 
> Yes, it will be a noop in this case because in the absence of any
> trips there will be nothing to do for it.  Still, user space can check
> the zone temperature via sysfs.
> 
> Adding a NULL pointer check before the place where the pointer in
> question is dereferenced is not a bad idea at all.


Yes, I agree. My apologies for being late.

Thanks Rafael for applying the patch.

Lukasz

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ