Message-ID: <20250209223433.symtjwbkcbwvhlc7@airbuntu>
Date: Sun, 9 Feb 2025 22:34:33 +0000
From: Qais Yousef <qyousef@...alina.io>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Christian Loehle <christian.loehle@....com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
Hongyan Xia <hongyan.xia2@....com>,
John Stultz <jstultz@...gle.com>, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7] sched: Consolidate cpufreq updates
On 09/12/24 13:33, Rafael J. Wysocki wrote:
> On Wed, Sep 11, 2024 at 10:34 PM Christian Loehle
> <christian.loehle@....com> wrote:
> >
> > On 7/28/24 19:45, Qais Yousef wrote:
> > > Improve the interaction with cpufreq governors by making the
> > > cpufreq_update_util() calls more intentional.
> > >
> > > At the moment we send them when load is updated for CFS, bandwidth for
> > > DL and at enqueue/dequeue for RT. But this can lead to too many updates
> > > sent in a short period of time and potentially be ignored at a critical
> > > moment due to the rate_limit_us in schedutil.
> > >
> > > For example, simultaneous task enqueue on the CPU where 2nd task is
> > > bigger and requires higher freq. The trigger to cpufreq_update_util() by
> > > the first task will lead to dropping the 2nd request until tick. Or
> > > another CPU in the same policy triggers a freq update shortly after.
> > >
> > > Updates at enqueue for RT are not strictly required. Though they do help
> > > to reduce the delay for switching the frequency and the potential
> > > observation of lower frequency during this delay. But current logic
> > > doesn't intentionally (at least to my understanding) try to speed up the
> > > request.
> > >
> > > To help reduce the amount of cpufreq updates and make them more
> > > purposeful, consolidate them into these locations:
> > >
> > > 1. context_switch()
> > > 2. task_tick_fair()
> > > 3. sched_balance_update_blocked_averages()
> > > 4. on sched_setscheduler() syscall that changes policy or uclamp values
> > > 5. on check_preempt_wakeup_fair() if wakeup preemption failed
> > > 6. on __add_running_bw() to guarantee DL bandwidth requirements.
> > >
> >
> > Actually now reading that code again reminded me, there is another
> > iowait boost change for intel_pstate.
> > intel_pstate has either intel_pstate_update_util() or
> > intel_pstate_update_util_hwp().
> > Both have
> > 	if (smp_processor_id() != cpu->cpu)
> > 		return;
> > Now since we move that update from enqueue to context_switch() that will
> > always be false.
> > I don't think that was deliberate but rather to simplify intel_pstate
> > synchronization, although !mcq device IO won't be boosted which you
> > could argue is good.
> > Just wanted to mention that, doesn't have to be a bad, but surely some
> > behavior change.
>
> This particular change shouldn't be problematic.

Thanks for checking, and sorry for the delayed response. Life got in the way and
I couldn't get back to this sooner.

Cheers
--
Qais Yousef