linux-kernel - Re: [PATCH v1] cpuidle: menu: Update documentation after previous changes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <fb5db9b4-397e-4a21-b145-f19758fa35dc@arm.com>
Date: Fri, 10 Jan 2025 14:19:43 +0000
From: Christian Loehle <christian.loehle@....com>
To: "Rafael J. Wysocki" <rjw@...ysocki.net>,
 Linux PM <linux-pm@...r.kernel.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
 Daniel Lezcano <daniel.lezcano@...aro.org>,
 Artem Bityutskiy <artem.bityutskiy@...ux.intel.com>
Subject: Re: [PATCH v1] cpuidle: menu: Update documentation after previous
 changes

On 1/10/25 12:46, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> 
> After commit 38f83090f515 ("cpuidle: menu: Remove iowait influence") and
> other previous changes, the description of the menu governor in the
> documentation does not match the code any more, so update it as
> appropriate.
> 
> Fixes: 38f83090f515 ("cpuidle: menu: Remove iowait influence")
> Fixes: 5484e31bbbff ("cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases")
> Reported-by: Artem Bityutskiy <artem.bityutskiy@...ux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> ---
>  Documentation/admin-guide/pm/cpuidle.rst |   72 +++++++++++++------------------
>  1 file changed, 31 insertions(+), 41 deletions(-)
> 
> --- a/Documentation/admin-guide/pm/cpuidle.rst
> +++ b/Documentation/admin-guide/pm/cpuidle.rst
> @@ -269,27 +269,7 @@
>  the CPU will ask the processor hardware to enter), it attempts to predict the
>  idle duration and uses the predicted value for idle state selection.
>  
> -It first obtains the time until the closest timer event with the assumption
> -that the scheduler tick will be stopped.  That time, referred to as the *sleep
> -length* in what follows, is the upper bound on the time before the next CPU
> -wakeup.  It is used to determine the sleep length range, which in turn is needed
> -to get the sleep length correction factor.
> -
> -The ``menu`` governor maintains two arrays of sleep length correction factors.
> -One of them is used when tasks previously running on the given CPU are waiting
> -for some I/O operations to complete and the other one is used when that is not
> -the case.  Each array contains several correction factor values that correspond
> -to different sleep length ranges organized so that each range represented in the
> -array is approximately 10 times wider than the previous one.
> -
> -The correction factor for the given sleep length range (determined before
> -selecting the idle state for the CPU) is updated after the CPU has been woken
> -up and the closer the sleep length is to the observed idle duration, the closer
> -to 1 the correction factor becomes (it must fall between 0 and 1 inclusive).
> -The sleep length is multiplied by the correction factor for the range that it
> -falls into to obtain the first approximation of the predicted idle duration.
> -
> -Next, the governor uses a simple pattern recognition algorithm to refine its
> +It first uses a simple pattern recognition algorithm to obtain a preliminary
>  idle duration prediction.  Namely, it saves the last 8 observed idle duration
>  values and, when predicting the idle duration next time, it computes the average
>  and variance of them.  If the variance is small (smaller than 400 square
> @@ -301,29 +281,39 @@
>  taken as the "typical interval" value and so on, until either the "typical
>  interval" is determined or too many data points are disregarded, in which case
>  the "typical interval" is assumed to equal "infinity" (the maximum unsigned
> -integer value).  The "typical interval" computed this way is compared with the
> -sleep length multiplied by the correction factor and the minimum of the two is
> -taken as the predicted idle duration.
> -
> -Then, the governor computes an extra latency limit to help "interactive"
> -workloads.  It uses the observation that if the exit latency of the selected
> -idle state is comparable with the predicted idle duration, the total time spent
> -in that state probably will be very short and the amount of energy to save by
> -entering it will be relatively small, so likely it is better to avoid the
> -overhead related to entering that state and exiting it.  Thus selecting a
> -shallower state is likely to be a better option then.   The first approximation
> -of the extra latency limit is the predicted idle duration itself which
> -additionally is divided by a value depending on the number of tasks that
> -previously ran on the given CPU and now they are waiting for I/O operations to
> -complete.  The result of that division is compared with the latency limit coming
> -from the power management quality of service, or `PM QoS <cpu-pm-qos_>`_,
> -framework and the minimum of the two is taken as the limit for the idle states'
> -exit latency.
> +integer value).
> +
> +If the "typical interval" computed this way is long enough, the governor obtains
> +the time until the closest timer event with the assumption that the scheduler
> +tick will be stopped.  That time, referred to as the *sleep length* in what follows,
> +is the upper bound on the time before the next CPU wakeup.  It is used to determine
> +the sleep length range, which in turn is needed to get the sleep length correction
> +factor.

The sleep length of course isn't really an upper bound before the next CPU wakeup,
we just treat it as such, but I guess the doc doesn't need to be too pedantic.

> +
> +The ``menu`` governor maintains an array containing several correction factor
> +values that correspond to different sleep length ranges organized so that each
> +range represented in the array is approximately 10 times wider than the previous
> +one.
> +
> +The correction factor for the given sleep length range (determined before
> +selecting the idle state for the CPU) is updated after the CPU has been woken
> +up and the closer the sleep length is to the observed idle duration, the closer
> +to 1 the correction factor becomes (it must fall between 0 and 1 inclusive).
> +The sleep length is multiplied by the correction factor for the range that it
> +falls into to obtain an approximation of the predicted idle duration that is
> +compared to the "typical interval" determined previously and the minimum of
> +the two is taken as the idle duration prediction.
> +
> +If the "typical interval" value is small, which means that the CPU is likely
> +to be woken up soon enough, the sleep length computation is skipped as it may
> +be costly and the idle duration is simply predicted to equal the "typical
> +interval" value.
>  
>  Now, the governor is ready to walk the list of idle states and choose one of
>  them.  For this purpose, it compares the target residency of each state with
> -the predicted idle duration and the exit latency of it with the computed latency
> -limit.  It selects the state with the target residency closest to the predicted
> +the predicted idle duration and the exit latency of it with the with the latency
> +limit coming from the power management quality of service, or `PM QoS <cpu-pm-qos_>`_,
> +framework.  It selects the state with the target residency closest to the predicted
>  idle duration, but still below it, and exit latency that does not exceed the
>  limit.

Reviewed-by: Christian Loehle <christian.loehle@....com>