Message-ID: <31c86834-273b-458f-9914-eff76c283cfb@arm.com>
Date: Tue, 17 Dec 2024 10:38:28 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: "Rafael J. Wysocki" <rjw@...ysocki.net>,
 Linux PM <linux-pm@...r.kernel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Lukasz Luba <lukasz.luba@....com>,
 Peter Zijlstra <peterz@...radead.org>,
 Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
 Morten Rasmussen <morten.rasmussen@....com>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
 Pierre Gondois <pierre.gondois@....com>
Subject: Re: [RFC][PATCH v0.2 5/9] PM: EM: Introduce
 em_dev_expand_perf_domain()

On 29/11/2024 17:02, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> 
> Introduce a helper function for adding a CPU to an existing EM perf
> domain.
> 
> Subsequently, this will be used by the intel_pstate driver to add new
> CPUs to existing perf domains when those CPUs go online for the first
> time after the initialization of the driver.
> 
> No intentional functional impact.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> ---
> 
> v0.1 -> v0.2: No changes

Could you add information why this new EM interface is needed?

IIRC, you can't use the existing way (cpufreq_driver::register_em) since
it gets called too early (3) for the PD cpumasks to be ready. This issue
will be there for any system in which uarch domains are not congruent
with clock domains, which we haven't had to deal with on Arm's
heterogeneous CPUs so far.

__init intel_pstate_init()

  intel_pstate_register_driver()

    cpufreq_register_driver()

      subsys_interface_register()

        sif->add_dev() -> cpufreq_add_dev()

          cpufreq_online()

            if (!new_policy && cpufreq_driver->online)

            else

              cpufreq_driver->init() -> intel_pstate_cpu_init()

                __intel_pstate_cpu_init()

                  intel_pstate_init_cpu()

                    intel_pstate_get_cpu_pstates()

                      hybrid_add_to_domain()

                        em_dev_expand_perf_domain()              <-- (1)

                  intel_pstate_init_acpi_perf_limits()

                    intel_pstate_set_itmt_prio()                 <-- (2)

            if (new_policy)

              cpufreq_driver->register_em()                      <-- (3)

    hybrid_init_cpu_capacity_scaling()

      hybrid_refresh_cpu_capacity_scaling()

        __hybrid_refresh_cpu_capacity_scaling()                  <-- (4)

        hybrid_register_all_perf_domains()

          hybrid_register_perf_domain()

            em_dev_register_perf_domain()                        <-- (5)

      /* Enable EAS */
      sched_clear_itmt_support()                                 <-- (6)

Debugging this on a 'nosmt' i7-13700K (online mask =
[0,2,4,6,8,10,12,14,16-23]):

(1) Add CPU to existing hybrid PD or create new hybrid PD.
(2) Triggers the sched domain rebuild (+ enabling EAS) already here
    during startup?
    IMHO, the reason is that max_highest_perf > min_highest_perf because
    of different itmt prios.
    Happens for CPU8 on my machine (after CPU8 is added to hybrid PD
    0,2,4,6,8): itmt prio for CPU8 = 69 (1024) instead of 68 (1009).
    So it looks like EAS is enabled before (6)?
(3) ARM's way to do (5)
(4) Setting hybrid_max_perf_cpu
(5) Register EM here
(6) The call which is supposed to initially trigger the sched domain
    rebuild (+ enable EAS) (already done in (2) on my machine)

So (3) is not possible for Intel hybrid since each policy's cpumask
contains only one CPU, i.e. the CPUs don't share a clock.
And those cpumasks have to be built under (1) so they can be used in (5)?

[...]
