Message-ID: <CAGXv+5GX=7-4NVLtGtihEuNGbaeV3E+AwK=3iWqOwF5-XTyCaA@mail.gmail.com>
Date: Tue, 9 Jan 2024 11:45:43 +0800
From: Chen-Yu Tsai <wenst@...omium.org>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Daniel Lezcano <daniel.lezcano@...aro.org>, Zhang Rui <rui.zhang@...el.com>,
Lukasz Luba <lukasz.luba@....com>, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] thermal/core: Correctly free tz->tzp in thermal zone registration error path

On Tue, Dec 19, 2023 at 11:28 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Tue, Dec 19, 2023 at 9:27 AM Chen-Yu Tsai <wenst@...omium.org> wrote:
> >
> > After commit 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal
> > zone parameters structure"), the core now copies the thermal zone
> > parameters structure, and frees it if an error happens during thermal
> > zone device registration, or upon unregistration of the device.
> >
> > In the error path, if device_register() was called, then `tz` disappears
> > before kfree(tz->tzp) happens, causing a NULL pointer dereference crash.
> >
> > In my case, the error path was entered from the sbs power supply driver,
> > which through the power supply core registers a thermal zone *without
> > trip points* for the battery temperature sensor. This combined with
> > setting the default thermal governor to "power allocator", which
> > *requires* trip_max, causes the thermal zone registration to error out.
> >
> > The error path should handle two cases: one where device_register()
> > has not been called and the kobj is not yet reference counted, and one
> > where it has. The original commit tried to cover the first case,
> > but fails for the second. Fix this by adding kfree(tz->tzp) before
> > put_device() to cover the second case, and by checking that `tz` is
> > still valid before the later kfree(tz->tzp) so the second case does not
> > crash there.
> >
> > Fixes: 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure")
> > Signed-off-by: Chen-Yu Tsai <wenst@...omium.org>
> > ---
> > This includes the minimal changes to fix the crash. I suppose some other
> > things in the thermal core could be reworked:
> > - Don't use "power allocator" for thermal zones without trip points
> > - Move some of the thermal zone cleanup code into the release function
> >
> > drivers/thermal/thermal_core.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> > index 2415dc50c31d..e47826d82062 100644
> > --- a/drivers/thermal/thermal_core.c
> > +++ b/drivers/thermal/thermal_core.c
> > @@ -1392,12 +1392,16 @@ thermal_zone_device_register_with_trips(const char *type, struct thermal_trip *t
> >  unregister:
> >  	device_del(&tz->device);
> >  release_device:
> > +	/* Free tz->tzp before tz goes away. */
> > +	kfree(tz->tzp);
> >  	put_device(&tz->device);
> >  	tz = NULL;
> >  remove_id:
> >  	ida_free(&thermal_tz_ida, id);
> >  free_tzp:
> > -	kfree(tz->tzp);
> > +	/* If we arrived here before device_register() was called. */
> > +	if (tz)
> > +		kfree(tz->tzp);
> >  free_tz:
> >  	kfree(tz);
> >  	return ERR_PTR(result);
> > --
>
> Can you please test linux-next from today? The issue addressed by
> your patch should be fixed there.

Sorry for the very late reply. Yes, the issue is fixed in linux-next. Thanks.

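For reference, the two error-path cases described in the quoted patch can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the thermal core code: `struct zone`, `struct params`, `zone_put()`, `zone_register()`, `zone_unregister()` and the `fail_stage` values are all hypothetical stand-ins, and a bare integer refcount plays the role of the embedded struct device.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the thermal zone and its parameters copy. */
struct params { int unused; };

struct zone {
	int refcount;		/* stands in for the embedded struct device */
	struct params *tzp;	/* separately allocated copy, like tz->tzp */
};

/* Counters so a caller can check that each object is freed exactly once. */
static int params_freed;
static int zones_freed;

enum fail_stage { FAIL_NONE, FAIL_BEFORE_REGISTER, FAIL_AFTER_REGISTER };

/* Dropping the last reference frees the zone, like a device release callback. */
static void zone_put(struct zone *z)
{
	assert(z->refcount > 0);
	if (--z->refcount == 0) {
		free(z);
		zones_freed++;
	}
}

static struct zone *zone_register(enum fail_stage fail)
{
	struct zone *z = calloc(1, sizeof(*z));

	if (!z)
		return NULL;

	z->tzp = calloc(1, sizeof(*z->tzp));
	if (!z->tzp)
		goto free_z;

	if (fail == FAIL_BEFORE_REGISTER)
		goto free_tzp;

	z->refcount = 1;	/* stands in for device_register() */

	if (fail == FAIL_AFTER_REGISTER)
		goto release;

	return z;

release:
	/* Case 2: free the member while z is still valid, then drop the ref. */
	free(z->tzp);
	params_freed++;
	zone_put(z);
	z = NULL;		/* z is gone; the labels below must not touch it */
free_tzp:
	/* Case 1: reached before registration, so z is still ours to free. */
	if (z) {
		free(z->tzp);
		params_freed++;
	}
free_z:
	if (z) {
		free(z);
		zones_freed++;
	}
	return NULL;
}

/* Normal teardown follows the same order: member first, then the last ref. */
static void zone_unregister(struct zone *z)
{
	free(z->tzp);
	params_freed++;
	zone_put(z);
}
```

Calling zone_register() with each fail_stage value, and zone_unregister() on the successful result, should leave params_freed and zones_freed equal to the number of calls. The point of the sketch is the ordering: anything hanging off the object has to be freed before the final put, because the put may free the object itself.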
ChenYu