Message-ID: <Z7rm2XRqhCM8m9IU@gmail.com>
Date: Sun, 23 Feb 2025 10:14:01 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Qais Yousef <qyousef@...alina.io>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Christian Loehle <christian.loehle@....com>,
	Hongyan Xia <hongyan.xia2@....com>,
	John Stultz <jstultz@...gle.com>, Anjali K <anjalik@...ux.ibm.com>,
	linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8] sched: Consolidate cpufreq updates


* Qais Yousef <qyousef@...alina.io> wrote:

> On 02/21/25 16:47, Ingo Molnar wrote:
> > 
> > * Qais Yousef <qyousef@...alina.io> wrote:
> > 
> > > ---
> > >  include/linux/sched/cpufreq.h    |   4 +-
> > >  kernel/sched/core.c              | 116 +++++++++++++++++++++++++++--
> > >  kernel/sched/cpufreq_schedutil.c | 122 +++++++++++++++++++------------
> > >  kernel/sched/deadline.c          |  10 ++-
> > >  kernel/sched/fair.c              |  84 +++++++++------------
> > >  kernel/sched/rt.c                |   8 +-
> > >  kernel/sched/sched.h             |   9 ++-
> > >  kernel/sched/syscalls.c          |  30 ++++++--
> > >  8 files changed, 266 insertions(+), 117 deletions(-)
> > 
> > The changelog is rather long, and the diffstat is non-trivial.
> > 
> > Could you please split this up into multiple patches?
> 
> Sure. I did consider that but what stopped me is that I couldn't see 
> how I could break them into independent patches. A lot of corner 
> cases needed to be addressed and if I moved them to their own patches 
> I'd potentially break bisectability of this code. If this is not a 
> problem then I can see how I can do a better split. If it is a 
> problem, I'll still try to think it over but it might require a bit 
> of stretching. But I admit I didn't try to think it over that hard.

Yeah, so bisectability should definitely be preserved.

I had a quick look, and these changes look fairly easy to split up to 
reduce size/complexity of individual patches. The following split looks 
pretty natural:

 # ============{ Preparatory changes with no change in functionality: }=========>

 [PATCH 1/9] Extend check_class_changed() with the 'class_changed' return bool
             # But don't use it at call sites yet

 [PATCH 2/9] Introduce and maintain the sugov_cpu::last_iowait_update metric
             # But don't use it yet

 [PATCH 3/9] Extend sugov_iowait_apply() with a 'flags' parameter
             # But don't use it yet internally

 [PATCH 4/9] Extend sugov_next_freq_shared() with the 'flags' parameter
             # But don't use it yet internally

 [PATCH 5/9] Clean up the enqueue_task_fair() control flow to make it easier to extend
             # This adds the goto restructuring but doesn't change functionality

 [PATCH 6/9] Introduce and maintain the cfs_rq::decayed flag
             # But don't use it yet

 [PATCH 7/9] Extend __setscheduler_uclamp() with the 'update_cpufreq' return bool
             # But don't use it yet

 # ============{ Behavioral changes: }==========>

 [PATCH 8/9] Change sugov_iowait_apply() behavior
 [PATCH 9/9] Change sugov_next_freq_shared() behavior

 ... etc.
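
To illustrate the "extend it, but don't use it yet" shape of the preparatory patches, here is a minimal sketch. All names (demo_task, demo_check_class_changed(), demo_set_class()) are made up for illustration and are not the real scheduler code; the point is only that the helper gains a return value while the call site still ignores it, so behavior is unchanged and bisection stays safe:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a task, not the real task_struct. */
struct demo_task {
	int class_id;
	int nr_switches;	/* counts class switches, for illustration */
};

/*
 * Preparatory step: the helper used to return void. It now also
 * reports whether the class changed, but does exactly the same
 * work as before.
 */
static bool demo_check_class_changed(struct demo_task *p, int prev_class)
{
	bool class_changed = p->class_id != prev_class;

	if (class_changed)
		p->nr_switches++;

	return class_changed;
}

/*
 * Call site in the preparatory patch: the new return value is
 * deliberately ignored. A later, behavioral patch would act on it.
 */
static void demo_set_class(struct demo_task *p, int new_class)
{
	int prev_class = p->class_id;

	p->class_id = new_class;
	(void)demo_check_class_changed(p, prev_class);
}
```

A follow-up behavioral patch would then be a small diff that only replaces the ignored return value with the new logic.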

This is only a quick stab at the most trivial split-ups, it's not a 
complete list, and I saw other opportunities for split-up too. Please 
make these changes as fine-grained as possible, as it changes behavior 
and there is a fair chance of behavioral regressions down the road - 
especially as the patch itself notes that even the new logic isn't 
perfect yet.

If the behavioral changes can be split into further steps, that would 
be preferable too.

Also:

 - Please make the rq->cfs.decayed logic unconditional on UP too, even 
   if it's not used. This reduces some of the ugly #ifdeffery AFAICS.

 - Please don't add prototypes for internal static functions like 
   __update_cpufreq_ctx_switch(), define the functions in the right 
   order instead.

 - Also, please read your comments and fix typos:

+                * This logic relied on PELT signal decays happening once every
+                * 1ms. But due to changes to how updates are done now, we can
+                * end up with more request coming up leading to iowait boost
+                * to be prematurely reduced. Make the assumption explicit
+                * until we improve the iowait boost logic to be better in
+                * general as it is due for an overhaul.

  s/request
   /requests

+        * We want to update cpufreq at context switch, but on systems with
+        * long TICK values, this can happen after a long time while more tasks
+        * would have been added meanwhile leaving us potentially running at
+        * inadequate frequency for extended period of time.

  Either 'an inadequate frequency' or 'inadequate frequencies'.

+        * This logic should only apply when new fair task was added to the
+        * CPU, we'd want to defer to context switch as much as possible, but
+        * to avoid the potential delays mentioned above, let's check if this
+        * additional tasks warrants sending an update sooner.

  s/when new fair task
   /when a new fair task

  s/this additional tasks
   /this additional task

(I haven't checked the spelling exhaustively, there might be more.)
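
On the first two points above, a rough sketch of the intended shape. The names (demo_cfs_rq, demo_update_cpufreq_ctx_switch(), demo_context_switch()) are hypothetical and this is not the actual patch code; it only shows a flag that exists unconditionally instead of under an #ifdef, and a static helper defined before its caller so that no prototype is needed:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a runqueue, not the real struct cfs_rq. */
struct demo_cfs_rq {
	unsigned long util;
	bool decayed;	/* always present, even on configs that never read it */
};

/*
 * The callee comes first in the file, so the caller below needs
 * no forward declaration.
 */
static bool demo_update_cpufreq_ctx_switch(struct demo_cfs_rq *cfs)
{
	bool update = cfs->decayed;

	/* Consume the flag so the next switch does not repeat the update. */
	cfs->decayed = false;

	return update;
}

static bool demo_context_switch(struct demo_cfs_rq *cfs)
{
	return demo_update_cpufreq_ctx_switch(cfs);
}
```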

Thanks,

	Ingo
