[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAJZ5v0jus4bzeZhUK4WC7uypQkh-_MMuU1M54figsGV3+5OhUg@mail.gmail.com>
Date: Tue, 17 Dec 2024 21:40:54 +0100
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>, Linux PM <linux-pm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, Lukasz Luba <lukasz.luba@....com>,
Peter Zijlstra <peterz@...radead.org>,
Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Morten Rasmussen <morten.rasmussen@....com>, Vincent Guittot <vincent.guittot@...aro.org>,
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
Pierre Gondois <pierre.gondois@....com>
Subject: Re: [RFC][PATCH v021 5/9] PM: EM: Introduce em_dev_expand_perf_domain()
On Tue, Dec 17, 2024 at 10:38 AM Dietmar Eggemann
<dietmar.eggemann@....com> wrote:
>
> On 29/11/2024 17:02, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> >
> > Introduce a helper function for adding a CPU to an existing EM perf
> > domain.
> >
> > Subsequently, this will be used by the intel_pstate driver to add new
> > CPUs to existing perf domains when those CPUs go online for the first
> > time after the initialization of the driver.
> >
> > No intentional functional impact.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> > ---
> >
> > v0.1 -> v0.2: No changes
>
> Could you add information why this new EM interface is needed?
There is some of it in the changelog already.
In fact, it is only needed in a corner case when the system starts
with some CPUs offline and they only go online later (as a result of
an explicit request from user space). That is the only case when a
new CPU may need to be added to an existing perf domain.
> IIRC, you can't use the existing way (cpufreq_driver::register_em) since
> it gets called to early (3) for the PD cpumasks to be ready. This issue
> will be there for any system in which uarch domains are not congruent
> with clock domains which we hadn't have to deal with Arm's heterogeneous
> CPUs so far.
Basically, yes.
> __init intel_pstate_init()
>
> intel_pstate_register_driver()
>
> cpufreq_register_driver()
>
> subsys_interface_register()
>
> sif->add_dev() -> cpufreq_add_dev()
>
> cpufreq_online()
>
> if (!new_policy && cpufreq_driver->online)
>
> else
>
> cpufreq_driver->init() -> intel_pstate_cpu_init()
>
> __intel_pstate_cpu_init()
>
> intel_pstate_init_cpu()
>
> intel_pstate_get_cpu_pstates()
>
> hybrid_add_to_domain()
>
> em_dev_expand_perf_domain() <-- (1)
>
> intel_pstate_init_acpi_perf_limits()
>
> intel_pstate_set_itmt_prio() <-- (2)
>
> if (new_policy)
>
> cpufreq_driver->register_em() <-- (3)
>
> hybrid_init_cpu_capacity_scaling()
>
> hybrid_refresh_cpu_capacity_scaling()
>
> __hybrid_refresh_cpu_capacity_scaling() <-- (4)
>
> hybrid_register_all_perf_domains()
>
> hybrid_register_perf_domain()
>
> em_dev_register_perf_domain() <-- (5)
>
> /* Enable EAS */
> sched_clear_itmt_support() <-- (6)
>
> Debugging this on a 'nosmt' i7-13700K (online mask =
> [0,2,4,6,8,10,12,14,16-23]
>
> (1) Add CPU to existing hybrid PD or create new hybrid PD.
Not exactly.
(1) is just to expand an existing perf domain if the CPU is new (was
offline all the time previously).
Likewise, the direct hybrid_register_perf_domain() call in
hybrid_add_to_domain() is to add a new perf domain if the given CPU is
new (was offline all the time previously) and is the first one of the
given type (say, the system is starting with all E-cores offline).
It won't succeed before (4).
For CPUs that are online to start with, hybrid_add_to_domain() assigns
them to hybrid domains and PDs are created for them in
hybrid_register_all_perf_domains().
> (2) Triggers sched domain rebuild (+ enabling EAS) already here during
> startup ?
This is just to enable ITMT which is the default mechanism for Intel
hybrid platforms. It also requires a rebuild of sched domains to be
enabled.
> IMHO, reason is that max_highest_perf > min_highest_perf because of
> different itmt prio
Yes (which means that the platform is at least not homogenous).
This really has been introduced for the handling of favored cores on
otherwise non-hybrid platforms (say Tiger Lake).
> Happens for CPU8 on my machine (after CPU8 is added to hybrid PD
> 0,2,4,6,8) (itmt prio for CPU8=69 (1024) instead of 68 (1009)).
> So it looks like EAS is enabled before (6) ?
No, it is ITMT because CPU8 is a favored core.
> (3) ARM's way to do (5)
> (4) Setting hybrid_max_perf_cpu
> (5) Register EM here
> (6) Actual call to initially triggers sched domain rebuild (+ enable
> EAS) (done already in (2) on my machine)
This is the second rebuild of sched domains to turn off ITMT and
enable EAS. The previous one is to enable ITMT.
The earlier enabling of ITMT could be avoided I think, but that would
be a complication on platforms that contain favored cores but
otherwise are not hybrid.
> So (3) is not possible for Intel hybrid since the policy's cpumask(s)
It is possible in principle, but not particularly convenient because
at that point it is not clear whether or not the platform is really
hybrid and SMT is off and so whether or not EAS is to be used.
> contain only one CPUs, i.e. CPUs are not sharing clock.
> And those cpumasks have to be build under (1) to be used in (5)?
They are built by the caller of (1) to be precise, but yes.
Powered by blists - more mailing lists