linux-kernel - Re: [PATCH v1 1/4] cpuidle: teo: Add polling flag check to early return path


Message-ID: <CAJZ5v0gXVYQM5EcQPfEKfLwjoj4yEFF1Lh2q1oXvuFmRyL5JGg@mail.gmail.com>
Date: Thu, 23 Jan 2025 19:12:27 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: dedekind1@...il.com
Cc: Christian Loehle <christian.loehle@....com>, "Rafael J. Wysocki" <rjw@...ysocki.net>, 
	Linux PM <linux-pm@...r.kernel.org>, Aboorva Devarajan <aboorvad@...ux.ibm.com>, 
	LKML <linux-kernel@...r.kernel.org>, Daniel Lezcano <daniel.lezcano@...aro.org>
Subject: Re: [PATCH v1 1/4] cpuidle: teo: Add polling flag check to early
 return path

On Thu, Jan 23, 2025 at 5:54 PM Artem Bityutskiy <dedekind1@...il.com> wrote:
>
> On Fri, 2025-01-10 at 13:16 +0000, Christian Loehle wrote:
> > This would then enable intercept-detection only for <50% of the time,
> > another option is to not allow intercepts selecting a polling state, but
> > there were recent complaints about this exact behavior from Aboorva (+TO).
> > They don't have a low-latency non-polling state.
>
> What do folks think about the following idea?
>
> The idle governor algorithm essentially predicts the sleep time (step 1) and,
> based on that, selects the idle state (step 2).

No, it's kind of the reverse nowadays.

Step 1 is to preselect an idle state using the idle duration
information from the recent past and step 2 is to refine the selection
using the sleep time information (unless step 1 by itself is
sufficient).

In particular, teo doesn't attempt to predict the idle duration.  It
basically picks the deepest idle state that wouldn't have been too
deep in the majority of cases taken into account.
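
As a rough illustration of "the deepest idle state that wouldn't have been
too deep in the majority of cases", here is a simplified sketch. It is not
the kernel's teo code, which tracks per-state statistics rather than a raw
list of durations; the function name, the target residencies and the sample
history are all made up for the example.

```python
# Illustrative only: not the kernel's teo implementation.
# Residencies and idle durations below are hypothetical values in microseconds.

def deepest_not_too_deep(target_residency_us, recent_idle_us):
    """Return the index of the deepest state whose target residency was met
    by more than half of the recent idle durations (state 0 as a fallback)."""
    total = len(recent_idle_us)
    choice = 0
    # States are assumed to be ordered from shallowest to deepest.
    for idx, residency in enumerate(target_residency_us):
        # A state "would not have been too deep" for a past idle period if
        # that period lasted at least as long as the state's target residency.
        ok = sum(1 for duration in recent_idle_us if duration >= residency)
        if 2 * ok > total:  # strictly more than 50% of the cases
            choice = idx
    return choice


residencies = [0, 2, 50, 600]
history = [5, 30, 70, 80, 900, 40, 65, 1200]
selected = deepest_not_too_deep(residencies, history)
print(selected)
```

With this history, five of the eight idle periods lasted at least 50 us but
only two lasted at least 600 us, so the sketch stops at the third state
(index 2) rather than the deepest one.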

> What if we add a sleep time factor and expose it via sysfs? The predicted sleep
> time would be multiplied by the factor (between steps 1 and 2).
>
> The default factor value is 1 (or 100%). If users want teo to be more hesitant
> about selecting deeper idle states, they can set it to 0.5 (or 50%) or some other
> value < 1. If users want teo to be more enthusiastic about selecting deeper idle
> states, they set the factor to a value > 1.
>
> We could expose it via sysfs and allow changing in some reasonable range.
>
> The idea is to let users adjust how enthusiastic the idle governor (teo in this
> case) is about deep C-states.

For teo, that value from user space would effectively work as an upper
boundary for the target residency of idle states to select and there
is not too much difference between this and the existing PM (CPU
latency) QoS (limiting target residency is equivalent to limiting exit
latency).
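
To make the "limiting target residency is equivalent to limiting exit
latency" remark concrete, here is a small sketch. It assumes a state table in
which both exit latency and target residency grow with state depth; the state
names and numbers are hypothetical, and the helper functions are not kernel
APIs.

```python
# Hypothetical state table: (name, exit_latency_us, target_residency_us),
# ordered from shallowest to deepest. All values are made up.
STATES = [
    ("POLL", 0, 0),
    ("C1", 2, 2),
    ("C6", 50, 150),
    ("C10", 300, 900),
]


def allowed_by_latency(limit_us):
    """States that a latency-style limit would leave available."""
    return [name for name, latency, _ in STATES if latency <= limit_us]


def allowed_by_residency(limit_us):
    """States that a target-residency cap would leave available."""
    return [name for name, _, residency in STATES if residency <= limit_us]


by_latency = allowed_by_latency(50)
by_residency = allowed_by_residency(150)
print(by_latency, by_residency)
```

Because both columns increase with depth in this table, either kind of limit
just cuts the table off at some depth, so a residency cap of 150 us and a
latency limit of 50 us leave the same set of states.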

If you want a "latency bias" knob, for teo it might be something
related to what is regarded as a "sufficient majority".  Right now it
is hardcoded to "over 50%", but it may be a different fraction in
principle (e.g. 5/8, 2/3, etc.) and it may be tunable within some
limits.
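
A minimal sketch of what a tunable "sufficient majority" could look like,
under simplifying assumptions: the selection works on a plain list of recent
idle durations (the real governor keeps per-state statistics instead), the
threshold is treated as strict ("over" the fraction), and the function name,
the `majority` parameter and all numbers are hypothetical.

```python
from fractions import Fraction

# Illustrative only: not the kernel's teo implementation.


def select_state(target_residency_us, recent_idle_us, majority=Fraction(1, 2)):
    """Pick the deepest state that would not have been too deep in strictly
    more than `majority` of the recent cases (state 0 as a fallback)."""
    total = len(recent_idle_us)
    choice = 0
    # States are assumed to be ordered from shallowest to deepest.
    for idx, residency in enumerate(target_residency_us):
        ok = sum(1 for duration in recent_idle_us if duration >= residency)
        if total and Fraction(ok, total) > majority:
            choice = idx
    return choice


residencies = [0, 2, 50, 600]                  # hypothetical, in microseconds
history = [5, 30, 70, 80, 900, 40, 65, 1200]   # hypothetical, in microseconds
results = {
    str(f): select_state(residencies, history, f)
    for f in (Fraction(1, 2), Fraction(5, 8), Fraction(2, 3))
}
print(results)
```

In this example five of the eight samples are long enough for the third
state. That clears "over 1/2" but not "over 5/8" or "over 2/3", so raising
the required fraction makes the selection shallower by one state.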