lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <80f47262-9354-472f-8122-5ae262c0a46d@quicinc.com>
Date:   Thu, 3 Aug 2023 09:48:37 +0530
From:   Pavan Kondeti <quic_pkondeti@...cinc.com>
To:     David Dai <davidai@...gle.com>
CC:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Rob Herring <robh+dt@...nel.org>,
        "Krzysztof Kozlowski" <krzysztof.kozlowski+dt@...aro.org>,
        Conor Dooley <conor+dt@...nel.org>,
        Sudeep Holla <sudeep.holla@....com>,
        Saravana Kannan <saravanak@...gle.com>,
        Quentin Perret <qperret@...gle.com>,
        Masami Hiramatsu <mhiramat@...gle.com>,
        Will Deacon <will@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        "Marc Zyngier" <maz@...nel.org>,
        Oliver Upton <oliver.upton@...ux.dev>,
        "Dietmar Eggemann" <dietmar.eggemann@....com>,
        Pavan Kondeti <quic_pkondeti@...cinc.com>,
        Gupta Pankaj <pankaj.gupta@....com>,
        Mel Gorman <mgorman@...e.de>, <kernel-team@...roid.com>,
        <linux-pm@...r.kernel.org>, <devicetree@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 2/2] cpufreq: add virtual-cpufreq driver

On Mon, Jul 31, 2023 at 10:46:09AM -0700, David Dai wrote:
> Introduce a virtualized cpufreq driver for guest kernels to improve
> performance and power of workloads within VMs.
> 
> This driver does two main things:
> 
> 1. Sends the frequency of vCPUs as a hint to the host. The host uses the
> hint to schedule the vCPU threads and decide physical CPU frequency.
> 
> 2. If a VM does not support a virtualized FIE(like AMUs), it queries the
> host CPU frequency by reading a MMIO region of a virtual cpufreq device
> to update the guest's frequency scaling factor periodically. This enables
> accurate Per-Entity Load Tracking for tasks running in the guest.
> 
> Co-developed-by: Saravana Kannan <saravanak@...gle.com>
> Signed-off-by: Saravana Kannan <saravanak@...gle.com>
> Signed-off-by: David Dai <davidai@...gle.com>

[...]

> +static void virt_scale_freq_tick(void)
> +{
> +	struct cpufreq_policy *policy = cpufreq_cpu_get(smp_processor_id());
> +	struct virt_cpufreq_drv_data *data = policy->driver_data;
> +	u32 max_freq = (u32)policy->cpuinfo.max_freq;
> +	u64 cur_freq;
> +	u64 scale;
> +
> +	cpufreq_cpu_put(policy);
> +
> +	cur_freq = (u64)data->ops->get_freq(policy);
> +	cur_freq <<= SCHED_CAPACITY_SHIFT;
> +	scale = div_u64(cur_freq, max_freq);
> +
> +	this_cpu_write(arch_freq_scale, (unsigned long)scale);
> +}
> +

We expect the host to provide the frequency in kHz, can you please add a
comment about it. It is not very obvious when you look at the
REG_CUR_FREQ_OFFSET register name.

> +static struct scale_freq_data virt_sfd = {
> +	.source = SCALE_FREQ_SOURCE_VIRT,
> +	.set_freq_scale = virt_scale_freq_tick,
> +};
> +
> +static unsigned int virt_cpufreq_set_perf(struct cpufreq_policy *policy)
> +{
> +	struct virt_cpufreq_drv_data *data = policy->driver_data;
> +	/*
> +	 * Use cached frequency to avoid rounding to freq table entries
> +	 * and undo 25% frequency boost applied by schedutil.
> +	 */
> +	u32 freq = mult_frac(policy->cached_target_freq, 80, 100);
> +
> +	data->ops->set_freq(policy, freq);
> +	return 0;
> +}

Why do we undo the frequency boost? A governor may apply other boosts
like RT (uclamp), iowait. It is not clear why we need to worry about
governor policies here.

> +
> +static unsigned int virt_cpufreq_fast_switch(struct cpufreq_policy *policy,
> +		unsigned int target_freq)
> +{
> +	virt_cpufreq_set_perf(policy);
> +	return target_freq;
> +}
> +
> +static int virt_cpufreq_target_index(struct cpufreq_policy *policy,
> +		unsigned int index)
> +{
> +	return virt_cpufreq_set_perf(policy);
> +}
> +
> +static int virt_cpufreq_cpu_init(struct cpufreq_policy *policy)
> +{
> +	struct virt_cpufreq_drv_data *drv_data = cpufreq_get_driver_data();
> +	struct cpufreq_frequency_table *table;
> +	struct device *cpu_dev;
> +	int ret;
> +
> +	cpu_dev = get_cpu_device(policy->cpu);
> +	if (!cpu_dev)
> +		return -ENODEV;
> +
> +	ret = dev_pm_opp_of_add_table(cpu_dev);
> +	if (ret)
> +		return ret;
> +
> +	ret = dev_pm_opp_get_opp_count(cpu_dev);
> +	if (ret <= 0) {
> +		dev_err(cpu_dev, "OPP table can't be empty\n");
> +		return -ENODEV;
> +	}
> +
> +	ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &table);
> +	if (ret) {
> +		dev_err(cpu_dev, "failed to init cpufreq table: %d\n", ret);
> +		return ret;
> +	}
> +
> +	policy->freq_table = table;
> +	policy->dvfs_possible_from_any_cpu = false;
> +	policy->fast_switch_possible = true;
> +	policy->driver_data = drv_data;
> +
> +	/*
> +	 * Only takes effect if another FIE source such as AMUs
> +	 * have not been registered.
> +	 */
> +	topology_set_scale_freq_source(&virt_sfd, policy->cpus);
> +
> +	return 0;
> +
> +}
> +

Do we need to register as FIE source even with the below commit? By
registering as a source, we are not supplying any accurate metric. We
still fallback on the same source that cpufreq implements it.

874f63531064 ("cpufreq: report whether cpufreq supports Frequency
Invariance (FI)")

Thanks,
Pavan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ