Message-ID: <CAPDyKFqyQuGC=ByxbDfJfFK_VRkwjTEQDXj1ket-51u+4_FYpw@mail.gmail.com>
Date: Tue, 4 Nov 2025 14:27:13 +0100
From: Ulf Hansson <ulf.hansson@...aro.org>
To: Dhruva Gole <d-gole@...com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, linux-pm@...r.kernel.org, 
	Vincent Guittot <vincent.guittot@...aro.org>, Peter Zijlstra <peterz@...radead.org>, 
	Kevin Hilman <khilman@...libre.com>, Pavel Machek <pavel@...nel.org>, Len Brown <len.brown@...el.com>, 
	Daniel Lezcano <daniel.lezcano@...aro.org>, Saravana Kannan <saravanak@...gle.com>, 
	Maulik Shah <quic_mkshah@...cinc.com>, Prasad Sodagudi <psodagud@...cinc.com>, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/4] PM: QoS: Introduce a CPU system-wakeup QoS limit

On Fri, 31 Oct 2025 at 19:37, Dhruva Gole <d-gole@...com> wrote:
>
> On Oct 31, 2025 at 14:47:29 +0100, Ulf Hansson wrote:
> > [...]
> >
> > > >
> > > > > It seems like overkill to me that a userspace program should be
> > > > > required to hold this file open just to keep the constraints
> > > > > honoured for the lifetime of the device. We should definitely
> > > > > give userspace the freedom to simply echo a value, and to cat the
> > > > > same place to read back the latency constraint that has been set.
> > > >
> > > > So you'd want a sysfs attribute here, but that has its own issues (the
> > > > last writer "wins", so if there are multiple users of it with
> > > > different needs in user space, things get tricky).
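
For reference, the fd-held model works like the existing
/dev/cpu_dma_latency interface: the constraint takes effect when a
value is written and is dropped as soon as the file descriptor is
closed. A minimal sketch against that existing node (a wakeup-latency
file for this series would presumably follow the same semantics):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int32_t latency_us = 100;	/* tolerated latency in microseconds */
	int fd = open("/dev/cpu_dma_latency", O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* The interface accepts a binary s32 (or an ASCII value). */
	if (write(fd, &latency_us, sizeof(latency_us)) < 0) {
		perror("write");
		close(fd);
		return 1;
	}
	pause();	/* the constraint is dropped once fd is closed */
	close(fd);
	return 0;
}
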
> > >
> > > If sysfs makes sense, then would it make sense to have something like a
> > > /sys/devices/system/cpu/cpu0/power/cpu_wakeup_latency entry?
> > >
> > > IMHO userspace should decide how to manage its users and how/whom to
> > > allow to set the latency constraint.
> > > We already have a per-CPU latency QoS entry in sysfs, for example.
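
The existing per-CPU entry referred to here may be
power/pm_qos_resume_latency_us; assuming so, a minimal sketch of the
echo-style model it implements, where the value takes effect
immediately and persists after the writer exits (last writer wins):

#include <stdio.h>

int main(void)
{
	const char *path =
		"/sys/devices/system/cpu/cpu0/power/pm_qos_resume_latency_us";
	FILE *f = fopen(path, "w");	/* typically requires root */

	if (!f) {
		perror("fopen");
		return 1;
	}
	/* Unlike the fd-held model, the constraint stays in effect
	 * after this process exits; the last value written wins. */
	fprintf(f, "100\n");
	fclose(f);
	return 0;
}
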
> > >
> > > >
> > > > > One other thing on my mind, probably unrelated to this specific
> > > > > series: I think we must have some sysfs entry, either in
> > > > > /sys/.../cpu0/cpuidle or under s2idle/, where we can show the
> > > > > next feasible s2idle state that the governor has chosen to enter
> > > > > based on the value set in cpu_wakeup_latency.
> > > >
> > > > Exit latency values for all states are exposed via sysfs.  Since
> > > > s2idle always uses the deepest state it can use, it is quite
> > > > straightforward to figure out which of them will be used going
> > > > forward, given a specific latency constraint.
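
To illustrate: the exit latency of each regular cpuidle state is in
/sys/devices/system/cpu/cpuN/cpuidle/stateN/latency, and s2idle takes
the deepest usable state. A simplified sketch of the prediction,
assuming the wakeup-latency constraint is honoured as this series
proposes (it ignores per-state "disable" flags and whether a state
actually provides an s2idle entry method):

#include <stdio.h>

int main(void)
{
	long constraint_us = 500;	/* example QoS limit */
	int chosen = -1;

	for (int i = 0; ; i++) {
		char path[128];
		long exit_latency_us;
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu0/cpuidle/state%d/latency",
			 i);
		f = fopen(path, "r");
		if (!f)
			break;	/* no more states */
		if (fscanf(f, "%ld", &exit_latency_us) == 1 &&
		    exit_latency_us <= constraint_us)
			chosen = i;	/* deeper states come later */
		fclose(f);
	}
	printf("predicted s2idle state: state%d\n", chosen);
	return 0;
}
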
> > >
> > > I disagree regarding the straightforward part. A system could have
> > > multiple domain hierarchies, for example, and each of these domains
> > > would have its own set of domain-idle-states, each with its own
> > > entry, exit, and residency latencies. While testing this series, I
> > > have myself been thoroughly confused at times about which idle-state
> > > the kernel actually picked, and had to add prints just to figure
> > > that out.
> >
> > If I understand correctly, most of that confusion comes from the
> > misunderstanding that residency is included in the state selection
> > with regard to QoS.
> >
> > Residency should not be accounted for; only the enter+exit latencies.
>
> Understood your point on the latencies. However, the point remains that
> in a multi-domain, multi-idle-state case, do we really have an easy way
> to determine which idle-state the governor is going to pick next? We
> don't even expose the entry/exit latencies in sysfs, btw...

I agree, we should extend the sysfs/debugfs information about the
domain-idle-states with this too, especially since we already have it
for the regular idle states that are managed by cpuidle.

>
> >
> > >
> > > When implementing these things for the first time, especially with
> > > many complex domain-idle-states, it would indeed help a lot if the
> > > kernel could just advertise somewhere which state the governor is
> > > going to pick as the next s2idle state.
> >
> > The problem with advertising upfront is that the state selection is
> > done dynamically. It simply can't work.
>
> I understand it might be done dynamically, but IIUC the only constraint
> being taken into account really comes from userspace. I don't think
> this series takes into account, or even exposes, any API for the kernel
> side to modify the cpu wakeup latency (which I think you should, but
> that's an entirely orthogonal discussion and I don't want to mix it in
> here). So as far as "dynamic" is concerned, if userspace controls which
> processes set the cpu wakeup constraints, then it's entirely okay for
> the kernel to tell userspace that, at any given moment, "this" is the
> next s2idle state it is going to pick if a system s2idle is started
> right now.
>
> >
> > >
> > > Also, I am not quite sure if these latencies are exposed in the
> > > domain-idle-states scenario ...
> > > I tried checking in /sys/kernel/debug/pm_genpd/XXX/ but I only see
> > > these:
> > > active_time  current_state  devices  idle_states  sub_domains  total_idle_time
> > >
> > > Maybe an additional s2idle_state entry, or something similar,
> > > appearing here is what I was inclined toward.
> >
> > That sounds like an idea that is worth exploring, if what you are
> > suggesting is to extend the idle state statistics. In principle we
> > want a new counter per idle state that indicates the number of times
> > we entered this state in s2idle, right?
>
> Absolutely, having a "global" kind of place to find the s2idle stats
> would be really useful.

Regular idle states that are managed by cpuidle have a per-state
directory called s2idle (present when the state supports s2idle), with
usage/time counters.
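
A quick sketch of dumping those counters; it assumes consecutive state
directories and stops at the first state without an s2idle directory,
which is a simplification:

#include <stdio.h>

int main(void)
{
	for (int i = 0; ; i++) {
		char path[160];
		unsigned long long usage = 0, time_us = 0;
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu0/cpuidle/state%d/s2idle/usage",
			 i);
		f = fopen(path, "r");
		if (!f)
			break;
		if (fscanf(f, "%llu", &usage) != 1)
			usage = 0;
		fclose(f);

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu0/cpuidle/state%d/s2idle/time",
			 i);
		f = fopen(path, "r");
		if (f) {
			if (fscanf(f, "%llu", &time_us) != 1)
				time_us = 0;
			fclose(f);
		}
		printf("state%d: entered %llu times, %llu us total\n",
		       i, usage, time_us);
	}
	return 0;
}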

That said, I agree, it's a good idea to add something similar for the
domain-idle-states that are managed by genpd.

Let me think about it and I will post a couple of patches that add
this information about the domain-idle-states.

Kind regards
Uffe
