linux-kernel - Re: [PATCH v5 6/7] arm64: use activity monitors for frequency invariance

Message-ID: <1da054d0-fef2-8a0b-b513-f2709b6d0208@arm.com>
Date:   Thu, 27 Feb 2020 08:43:50 +0000
From:   Lukasz Luba <lukasz.luba@....com>
To:     Ionela Voinescu <ionela.voinescu@....com>, catalin.marinas@....com,
        will@...nel.org, mark.rutland@....com, maz@...nel.org,
        suzuki.poulose@....com, sudeep.holla@....com,
        valentin.schneider@....com, dietmar.eggemann@....com,
        rjw@...ysocki.net, pkondeti@...eaurora.org
Cc:     peterz@...radead.org, mingo@...hat.com, vincent.guittot@...aro.org,
        viresh.kumar@...aro.org, linux-arm-kernel@...ts.infradead.org,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org
Subject: Re: [PATCH v5 6/7] arm64: use activity monitors for frequency
 invariance



On 2/26/20 1:29 PM, Ionela Voinescu wrote:
> The Frequency Invariance Engine (FIE) provides a frequency
> scaling correction factor that helps achieve more accurate
> load-tracking.
> 
> So far, for arm and arm64 platforms, this scale factor has been
> obtained based on the ratio between the current frequency and the
> maximum supported frequency recorded by the cpufreq policy. The
> setting of this scale factor is triggered from cpufreq drivers by
> calling arch_set_freq_scale. The current frequency used in computation
> is the frequency requested by a governor, but it may not be the
> frequency that was implemented by the platform.
> 
> This correction factor can also be obtained using a core counter and a
> constant counter to get information on the performance (frequency based
> only) obtained in a period of time. This will more accurately reflect
> the actual current frequency of the CPU, compared with the alternative
> implementation that reflects the request of a performance level from
> the OS.
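
For readers following the thread, here is a minimal userspace sketch of the idea in the paragraph above: the ratio of core cycles to constant-counter ticks over a window, multiplied by the constant counter's rate, gives the average frequency the CPU actually ran at. The helper name and the sample numbers are assumptions made for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): average CPU frequency in Hz
 * over a sampling window, derived from the two counter deltas.
 *
 *   delta_core    - core cycles elapsed in the window
 *   delta_const   - constant-counter ticks elapsed in the same window
 *   const_freq_hz - fixed rate of the constant counter
 */
static inline uint64_t avg_freq_hz(uint64_t delta_core, uint64_t delta_const,
				   uint64_t const_freq_hz)
{
	/* An empty window carries no frequency information. */
	assert(delta_const != 0);

	return delta_core * const_freq_hz / delta_const;
}
```

With an assumed 100 MHz constant counter, a 4 ms window is 400000 ticks; 8000000 core cycles in that window correspond to an average of 2 GHz, whatever frequency the governor had requested.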
> 
> Therefore, implement arch_scale_freq_tick to use activity monitors, if
> present, for the computation of the frequency scale factor.
> 
> The use of AMU counters depends on:
>   - CONFIG_ARM64_AMU_EXTN - depends on the AMU extension being present
>   - CONFIG_CPU_FREQ - the current frequency obtained using counter
>     information is divided by the maximum frequency obtained from the
>     cpufreq policy.
> 
> While it is possible to have a combination of CPUs in the system with
> and without support for activity monitors, the use of counters for
> frequency invariance is only enabled for a CPU if all related CPUs
> (CPUs in the same frequency domain) support and have enabled the core
> and constant activity monitor counters. In this way, there is a clear
> separation between the policies for which arch_set_freq_scale (cpufreq
> based FIE) is used, and the policies for which arch_scale_freq_tick
> (counter based FIE) is used to set the frequency scale factor. For
> this purpose, a late_initcall_sync is registered to trigger validation
> work for policies that will enable or disable the use of AMU counters
> for frequency invariance. If CONFIG_CPU_FREQ is not defined, the use
> of counters is enabled on all CPUs only if all possible CPUs correctly
> support the necessary counters.
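
The all-or-nothing rule per frequency domain described above can be illustrated with plain bitmasks. This is only a sketch under assumptions: a 64-bit mask stands in for a cpumask, and the function name and CPU layout are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the policy check: a policy's CPUs are added
 * to the counter-based FIE set only if every related CPU has valid
 * counters. Bit N of each mask represents CPU N.
 */
static inline uint64_t enable_policy(uint64_t fie_cpus, uint64_t related_cpus,
				     uint64_t valid_cpus)
{
	/* A policy always covers at least one CPU. */
	assert(related_cpus != 0);

	/* Subset test: no related CPU may fall outside valid_cpus. */
	if ((related_cpus & ~valid_cpus) == 0)
		fie_cpus |= related_cpus;

	return fie_cpus;
}
```

For example, with CPUs 0-5 valid (0x3F), a policy covering CPUs 0-3 (0x0F) is enabled, while a policy covering CPUs 4-7 (0xF0) is left on the cpufreq-based path because CPUs 6 and 7 lack usable counters.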
> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@....com>
> Cc: Catalin Marinas <catalin.marinas@....com>
> Cc: Will Deacon <will@...nel.org>
> Cc: Sudeep Holla <sudeep.holla@....com>
> ---
>   arch/arm64/include/asm/topology.h |   9 ++
>   arch/arm64/kernel/cpufeature.c    |   4 +
>   arch/arm64/kernel/topology.c      | 180 ++++++++++++++++++++++++++++++
>   drivers/base/arch_topology.c      |  12 ++
>   include/linux/arch_topology.h     |   2 +
>   5 files changed, 207 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index a4d945db95a2..21d4d40d6243 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -16,6 +16,15 @@ int pcibus_to_node(struct pci_bus *bus);
>   
>   #include <linux/arch_topology.h>
>   
> +#ifdef CONFIG_ARM64_AMU_EXTN
> +/*
> + * Replace task scheduler's default counter-based
> + * frequency-invariance scale factor setting.
> + */
> +void topology_scale_freq_tick(void);
> +#define arch_scale_freq_tick topology_scale_freq_tick
> +#endif /* CONFIG_ARM64_AMU_EXTN */
> +
>   /* Replace task scheduler's default frequency-invariant accounting */
>   #define arch_scale_freq_capacity topology_get_freq_scale
>   
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 60cebc071603..b8ec6c544d32 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1241,12 +1241,16 @@ bool cpu_has_amu_feat(int cpu)
>   	return cpumask_test_cpu(cpu, &amu_cpus);
>   }
>   
> +/* Initialize the use of AMU counters for frequency invariance */
> +extern void init_cpu_freq_invariance_counters(void);
> +
>   static void cpu_amu_enable(struct arm64_cpu_capabilities const *cap)
>   {
>   	if (has_cpuid_feature(cap, SCOPE_LOCAL_CPU)) {
>   		pr_info("detected CPU%d: Activity Monitors Unit (AMU)\n",
>   			smp_processor_id());
>   		cpumask_set_cpu(smp_processor_id(), &amu_cpus);
> +		init_cpu_freq_invariance_counters();
>   	}
>   }
>   
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index fa9528dfd0ce..0801a0f3c156 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -14,6 +14,7 @@
>   #include <linux/acpi.h>
>   #include <linux/arch_topology.h>
>   #include <linux/cacheinfo.h>
> +#include <linux/cpufreq.h>
>   #include <linux/init.h>
>   #include <linux/percpu.h>
>   
> @@ -120,4 +121,183 @@ int __init parse_acpi_topology(void)
>   }
>   #endif
>   
> +#ifdef CONFIG_ARM64_AMU_EXTN
>   
> +#undef pr_fmt
> +#define pr_fmt(fmt) "AMU: " fmt
> +
> +static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, arch_max_freq_scale);
> +static DEFINE_PER_CPU(u64, arch_const_cycles_prev);
> +static DEFINE_PER_CPU(u64, arch_core_cycles_prev);
> +static cpumask_var_t amu_fie_cpus;
> +
> +/* Initialize counter reference per-cpu variables for the current CPU */
> +void init_cpu_freq_invariance_counters(void)
> +{
> +	this_cpu_write(arch_core_cycles_prev,
> +		       read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0));
> +	this_cpu_write(arch_const_cycles_prev,
> +		       read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0));
> +}
> +
> +static int validate_cpu_freq_invariance_counters(int cpu)
> +{
> +	u64 max_freq_hz, ratio;
> +
> +	if (!cpu_has_amu_feat(cpu)) {
> +		pr_debug("CPU%d: counters are not supported.\n", cpu);
> +		return -EINVAL;
> +	}
> +
> +	if (unlikely(!per_cpu(arch_const_cycles_prev, cpu) ||
> +		     !per_cpu(arch_core_cycles_prev, cpu))) {
> +		pr_debug("CPU%d: cycle counters are not enabled.\n", cpu);
> +		return -EINVAL;
> +	}
> +
> +	/* Convert maximum frequency from kHz to Hz and validate */
> +	max_freq_hz = cpufreq_get_hw_max_freq(cpu) * 1000;
> +	if (unlikely(!max_freq_hz)) {
> +		pr_debug("CPU%d: invalid maximum frequency.\n", cpu);
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Pre-compute the fixed ratio between the frequency of the constant
> +	 * counter and the maximum frequency of the CPU.
> +	 *
> +	 *			      const_freq
> +	 * arch_max_freq_scale =   ---------------- * SCHED_CAPACITY_SCALE²
> +	 *			   cpuinfo_max_freq
> +	 *
> +	 * We use a factor of 2 * SCHED_CAPACITY_SHIFT -> SCHED_CAPACITY_SCALE²
> +	 * in order to ensure a good resolution for arch_max_freq_scale for
> +	 * very low arch timer frequencies (down to the kHz range which should
> +	 * be unlikely).
> +	 */
> +	ratio = (u64)arch_timer_get_rate() << (2 * SCHED_CAPACITY_SHIFT);
> +	ratio = div64_u64(ratio, max_freq_hz);
> +	if (!ratio) {
> +		WARN_ONCE(1, "System timer frequency too low.\n");
> +		return -EINVAL;
> +	}
> +
> +	per_cpu(arch_max_freq_scale, cpu) = (unsigned long)ratio;
> +
> +	return 0;
> +}
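
To make the resolution argument in the comment above concrete, here is a small userspace sketch of the pre-computed ratio. The macro and function names are assumptions that mirror the kernel ones; the value 10 for the shift matches SCHED_CAPACITY_SHIFT on the assumption that SCHED_CAPACITY_SCALE is 1024.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror SCHED_CAPACITY_SHIFT (SCHED_CAPACITY_SCALE == 1024). */
#define CAPACITY_SHIFT 10

/*
 * Hypothetical userspace version of the ratio computed in
 * validate_cpu_freq_invariance_counters():
 *
 *   max_freq_scale = (const_freq << (2 * CAPACITY_SHIFT)) / max_freq
 */
static inline uint64_t max_freq_scale(uint64_t const_freq_hz,
				      uint64_t max_freq_hz)
{
	/* The patch rejects a zero maximum frequency before dividing. */
	assert(max_freq_hz != 0);

	return (const_freq_hz << (2 * CAPACITY_SHIFT)) / max_freq_hz;
}
```

With an assumed 100 MHz timer and a 2 GHz maximum frequency this gives 52428. A single shift by CAPACITY_SHIFT would give only 51, and for a 32768 Hz timer it would truncate to 0, whereas the doubled shift still yields 17.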
> +
> +static inline bool
> +enable_policy_freq_counters(int cpu, cpumask_var_t valid_cpus)
> +{
> +	struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
> +
> +	if (!policy) {
> +		pr_debug("CPU%d: No cpufreq policy found.\n", cpu);
> +		return false;
> +	}
> +
> +	if (cpumask_subset(policy->related_cpus, valid_cpus))
> +		cpumask_or(amu_fie_cpus, policy->related_cpus,
> +			   amu_fie_cpus);
> +
> +	cpufreq_cpu_put(policy);
> +
> +	return true;
> +}
> +
> +static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
> +#define amu_freq_invariant() static_branch_unlikely(&amu_fie_key)
> +
> +static int __init init_amu_fie(void)
> +{
> +	cpumask_var_t valid_cpus;
> +	bool have_policy = false;
> +	int ret = 0;
> +	int cpu;
> +
> +	if (!zalloc_cpumask_var(&valid_cpus, GFP_KERNEL))
> +		return -ENOMEM;
> +
> +	if (!zalloc_cpumask_var(&amu_fie_cpus, GFP_KERNEL)) {
> +		ret = -ENOMEM;
> +		goto free_valid_mask;
> +	}
> +
> +	for_each_present_cpu(cpu) {
> +		if (validate_cpu_freq_invariance_counters(cpu))
> +			continue;
> +		cpumask_set_cpu(cpu, valid_cpus);
> +		have_policy |= enable_policy_freq_counters(cpu, valid_cpus);
> +	}
> +
> +	/*
> +	 * If we are not restricted by cpufreq policies, we only enable
> +	 * the use of the AMU feature for FIE if all CPUs support AMU.
> +	 * Otherwise, enable_policy_freq_counters has already enabled
> +	 * policy cpus.
> +	 */
> +	if (!have_policy && cpumask_equal(valid_cpus, cpu_present_mask))
> +		cpumask_or(amu_fie_cpus, amu_fie_cpus, valid_cpus);
> +
> +	if (!cpumask_empty(amu_fie_cpus)) {
> +		pr_info("CPUs[%*pbl]: counters will be used for FIE.",
> +			cpumask_pr_args(amu_fie_cpus));
> +		static_branch_enable(&amu_fie_key);
> +	}
> +
> +free_valid_mask:
> +	free_cpumask_var(valid_cpus);
> +
> +	return ret;
> +}
> +late_initcall_sync(init_amu_fie);
> +
> +bool arch_freq_counters_available(struct cpumask *cpus)
> +{
> +	return amu_freq_invariant() &&
> +	       cpumask_subset(cpus, amu_fie_cpus);
> +}
> +
> +void topology_scale_freq_tick(void)
> +{
> +	u64 prev_core_cnt, prev_const_cnt;
> +	u64 core_cnt, const_cnt, scale;
> +	int cpu = smp_processor_id();
> +
> +	if (!amu_freq_invariant())
> +		return;
> +
> +	if (!cpumask_test_cpu(cpu, amu_fie_cpus))
> +		return;
> +
> +	const_cnt = read_sysreg_s(SYS_AMEVCNTR0_CONST_EL0);
> +	core_cnt = read_sysreg_s(SYS_AMEVCNTR0_CORE_EL0);
> +	prev_const_cnt = this_cpu_read(arch_const_cycles_prev);
> +	prev_core_cnt = this_cpu_read(arch_core_cycles_prev);
> +
> +	if (unlikely(core_cnt <= prev_core_cnt ||
> +		     const_cnt <= prev_const_cnt))
> +		goto store_and_exit;
> +
> +	/*
> +	 *	    /\core    arch_max_freq_scale
> +	 * scale =  ------- * --------------------
> +	 *	    /\const   SCHED_CAPACITY_SCALE
> +	 *
> +	 * See validate_cpu_freq_invariance_counters() for details on
> +	 * arch_max_freq_scale and the use of SCHED_CAPACITY_SHIFT.
> +	 */
> +	scale = core_cnt - prev_core_cnt;
> +	scale *= this_cpu_read(arch_max_freq_scale);
> +	scale = div64_u64(scale >> SCHED_CAPACITY_SHIFT,
> +			  const_cnt - prev_const_cnt);
> +
> +	scale = min_t(unsigned long, scale, SCHED_CAPACITY_SCALE);
> +	this_cpu_write(freq_scale, (unsigned long)scale);
> +
> +store_and_exit:
> +	this_cpu_write(arch_core_cycles_prev, core_cnt);
> +	this_cpu_write(arch_const_cycles_prev, const_cnt);
> +}
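
The per-tick arithmetic in topology_scale_freq_tick() can be checked in userspace with a sketch like the one below. The names are hypothetical, the constants are assumed to mirror SCHED_CAPACITY_SHIFT and SCHED_CAPACITY_SCALE, and the pre-computed ratio is passed in as a parameter rather than read from a per-CPU variable.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror SCHED_CAPACITY_SHIFT and SCHED_CAPACITY_SCALE. */
#define CAPACITY_SHIFT 10
#define CAPACITY_SCALE (1ULL << CAPACITY_SHIFT)

/*
 * Hypothetical userspace version of the tick computation:
 *
 *   scale = (d_core * max_scale >> CAPACITY_SHIFT) / d_const
 *
 * clamped to CAPACITY_SCALE. max_scale is the pre-computed
 * const_freq / max_freq ratio, scaled by CAPACITY_SCALE squared.
 */
static inline uint64_t freq_scale_from_deltas(uint64_t d_core,
					      uint64_t d_const,
					      uint64_t max_scale)
{
	uint64_t scale;

	/* The patch skips the update when a counter did not advance. */
	assert(d_const != 0);

	scale = d_core * max_scale;
	scale = (scale >> CAPACITY_SHIFT) / d_const;

	return scale > CAPACITY_SCALE ? CAPACITY_SCALE : scale;
}
```

Taking max_scale = 52428 (an assumed 100 MHz timer and 2 GHz maximum) and a window of 400000 constant ticks, 4000000 core cycles (1 GHz) give a scale of 511, 8000000 cycles (2 GHz) give 1023, and anything faster than the recorded maximum is clamped to 1024. The off-by-one results come from integer truncation in the pre-computed ratio.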
> +#endif /* CONFIG_ARM64_AMU_EXTN */
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 6119e11a9f95..8d63673c1689 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -21,6 +21,10 @@
>   #include <linux/sched.h>
>   #include <linux/smp.h>
>   
> +__weak bool arch_freq_counters_available(struct cpumask *cpus)
> +{
> +	return false;
> +}
>   DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
>   
>   void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
> @@ -29,6 +33,14 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
>   	unsigned long scale;
>   	int i;
>   
> +	/*
> +	 * If the use of counters for FIE is enabled, just return as we don't
> +	 * want to update the scale factor with information from CPUFREQ.
> +	 * Instead the scale factor will be updated from arch_scale_freq_tick.
> +	 */
> +	if (arch_freq_counters_available(cpus))
> +		return;
> +
>   	scale = (cur_freq << SCHED_CAPACITY_SHIFT) / max_freq;
>   
>   	for_each_cpu(i, cpus)
> diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
> index 3015ecbb90b1..1ccdddb541a7 100644
> --- a/include/linux/arch_topology.h
> +++ b/include/linux/arch_topology.h
> @@ -33,6 +33,8 @@ unsigned long topology_get_freq_scale(int cpu)
>   	return per_cpu(freq_scale, cpu);
>   }
>   
> +bool arch_freq_counters_available(struct cpumask *cpus);
> +
>   struct cpu_topology {
>   	int thread_id;
>   	int core_id;
> 


Looks good.

Reviewed-by: Lukasz Luba <lukasz.luba@....com>

Regards,
Lukasz