lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKXuJqjAQERxkfTESUKSvPo3N5qOf+un6LbKys9YrBT_ocJG8A@mail.gmail.com>
Date: Mon, 24 Jun 2024 14:18:12 -0500
From: Steev Klimaszewski <steev@...i.org>
To: "Rafael J. Wysocki" <rjw@...ysocki.net>
Cc: Linux PM <linux-pm@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>, 
	Daniel Lezcano <daniel.lezcano@...aro.org>, Lukasz Luba <lukasz.luba@....com>, 
	Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>, Zhang Rui <rui.zhang@...el.com>, 
	Jens Glathe <jens.glathe@...schoolsolutions.biz>, Johan Hovold <johan+linaro@...nel.org>
Subject: Re: [PATCH v1] thermal: gov_step_wise: Go straight to instance->lower
 when mitigation is over

Hi Rafael,

On Sat, Jun 22, 2024 at 7:28 AM Rafael J. Wysocki <rjw@...ysocki.net> wrote:
>
> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>
> Commit b6846826982b ("thermal: gov_step_wise: Restore passive polling
> management") attempted to fix a Step-Wise thermal governor issue
> introduced by commit 042a3d80f118 ("thermal: core: Move passive polling
> management to the core"), which caused the governor to leave cooling
> devices in high states, by partially revering that commit.
>
> However, this turns out to be insufficient on some systems due to
> interactions between the governor code restored by commit b6846826982b
> and the passive polling management in the thermal core.
>
> For this reason, revert commit b6846826982b and make the governor set
> the target cooling device state to the "lower" one as soon as the zone
> temperature falls below the threshold of the trip point corresponding
> to the given thermal instance, which means that thermal mitigation is
> not necessary any more.
>
> Before this change the "lower" cooling device state would be reached in
> steps through the passive polling mechanism which was questionable for
> three reasons: (1) cooling device were kept in high states when that was
> not necessary (and it could adversely impact performance), (2) it only
> worked for thermal zones with nonzero passive_delay_jiffies value, and
> (3) passive polling belongs to the core and should not be hijacked by
> governors for their internal purposes.
>
> Fixes: b6846826982b ("thermal: gov_step_wise: Restore passive polling management")
> Closes: https://lore.kernel.org/linux-pm/6759ce9f-281d-4fcd-bb4c-b784a1cc5f6e@oldschoolsolutions.biz
> Reported-by: Jens Glathe <jens.glathe@...schoolsolutions.biz>
> Tested-by: Jens Glathe <jens.glathe@...schoolsolutions.biz>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> ---
>  drivers/thermal/gov_step_wise.c |   23 +++++------------------
>  1 file changed, 5 insertions(+), 18 deletions(-)
>
> Index: linux-pm/drivers/thermal/gov_step_wise.c
> ===================================================================
> --- linux-pm.orig/drivers/thermal/gov_step_wise.c
> +++ linux-pm/drivers/thermal/gov_step_wise.c
> @@ -55,7 +55,11 @@ static unsigned long get_target_state(st
>                 if (cur_state <= instance->lower)
>                         return THERMAL_NO_TARGET;
>
> -               return clamp(cur_state - 1, instance->lower, instance->upper);
> +               /*
> +                * If 'throttle' is false, no mitigation is necessary, so
> +                * request the lower state for this instance.
> +                */
> +               return instance->lower;
>         }
>
>         return instance->target;
> @@ -93,23 +97,6 @@ static void thermal_zone_trip_update(str
>                 if (instance->initialized && old_target == instance->target)
>                         continue;
>
> -               if (trip->type == THERMAL_TRIP_PASSIVE) {
> -                       /*
> -                        * If the target state for this thermal instance
> -                        * changes from THERMAL_NO_TARGET to something else,
> -                        * ensure that the zone temperature will be updated
> -                        * (assuming enabled passive cooling) until it becomes
> -                        * THERMAL_NO_TARGET again, or the cooling device may
> -                        * not be reset to its initial state.
> -                        */
> -                       if (old_target == THERMAL_NO_TARGET &&
> -                           instance->target != THERMAL_NO_TARGET)
> -                               tz->passive++;
> -                       else if (old_target != THERMAL_NO_TARGET &&
> -                                instance->target == THERMAL_NO_TARGET)
> -                               tz->passive--;
> -               }
> -
>                 instance->initialized = true;
>
>                 mutex_lock(&instance->cdev->lock);
>
>
>
I've tested this here against 6.10.0-rc5 and it looks like it does
what it says on the tin now.  Locks to low speeds at thermal and
(rather quickly) unlocks once it's no longer in the "danger zone"

Tested-by: Steev Klimaszewski <steev@...i.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ