Message-ID: <20130118200331.GF4062@pd.tnic>
Date:	Fri, 18 Jan 2013 21:03:31 +0100
From:	Borislav Petkov <bp@...en8.de>
To:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Cc:	Stefan Bader <stefan.bader@...onical.com>,
	Andre Przywara <andre@...rep.de>,
	"xen-devel@...ts.xensource.com" <xen-devel@...ts.xensource.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Matthew Garrett <mjg@...hat.com>
Subject: Re: kernel 3.7+ cpufreq regression on AMD system running as dom0

On Fri, Jan 18, 2013 at 02:00:15PM -0500, Konrad Rzeszutek Wilk wrote:
> I did not explain myself well. The fix is OK - it's just that the
> hypervisor causes the quirk to not work correctly. Hmm, I wonder if
> there are BIOSes that do the same thing (cause the MSR to return 0). Per
> your estimation of BIOS quality, it seems that this could happen.

Yeah, I don't think there's a limit to the amount of SNAFU a BIOS can
cause :-).

> Oh, I was not thinking DMI per se. I was thinking something similar to
> DMI-quirk API. But for the ACPI subsystem, so it would be:
> 
> 	if (ARM)
> 		... these quirks necessary
> 	if (AMD)
> 		.. these quirks
> 
> and then the ACPI code can make the calls to this ACPI-quirk API to
> figure out whether it needs to modulate values. But this is all
> hand-waving at this point.

Yeah, those CPUs are just too small a set to even warrant a quirk API.

[ … ]

> Right, that information is gathered from the MSRs. I think Xen would
> need to do this since it can read the MSRs correctly and modify the P-states.
> 
> So something like this in the hypervisor maybe (not even tested):

Yeah, something like that. Basically you can copy the quirk down to the
hypervisor.

But, Andre was explaining to me the other day that those P-state
frequencies are not that important.

Let me explain: the ondemand governor, for example, computes idle time
and each time it needs to increase the frequency, it switches straight
up to the highest one. When it decreases the freq. though, it goes down
in a staircase manner, going over all P-states, AFAICT.
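
To make the jump-up/staircase-down behaviour concrete, here is a toy
model of it. This is only a sketch of the behaviour as I described it
above, not the actual ondemand code: the P-state table, the two load
thresholds and the names next_pstate()/pstate_freq() are all made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical P-state table in MHz, fastest first (P0 = index 0). */
static const unsigned int pstate_mhz[] = { 2800, 2100, 1600, 800 };
#define NR_PSTATES (sizeof(pstate_mhz) / sizeof(pstate_mhz[0]))

/* Assumed load thresholds in percent busy; not the kernel's values. */
#define UP_THRESHOLD    80
#define DOWN_THRESHOLD  30

/* Frequency of a given P-state index, used for reporting only. */
static unsigned int pstate_freq(unsigned int idx)
{
	assert(idx < NR_PSTATES);
	return pstate_mhz[idx];
}

/*
 * Pick the next P-state index from the current one and the measured load:
 * - high load: jump straight to P0, skipping all intermediate states
 * - low load:  step down exactly one P-state (the "staircase")
 * - otherwise: stay where we are
 */
static unsigned int next_pstate(unsigned int cur, unsigned int load_pct)
{
	assert(cur < NR_PSTATES);

	if (load_pct > UP_THRESHOLD)
		return 0;
	if (load_pct < DOWN_THRESHOLD && cur + 1 < NR_PSTATES)
		return cur + 1;
	return cur;
}
```

Note that in this model the decision only uses the P-state index and
the load; the MHz values are looked up for reporting and nothing else.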

So we use them but not for all decisions. The question is, what do the
Xen governor(s) do?

If it only uses the frequencies for reporting, then it is not that big
of a deal. If it uses their values for switching decisions, then it
probably needs the correct ones.

> diff --git a/xen/arch/x86/acpi/cpufreq/powernow.c b/xen/arch/x86/acpi/cpufreq/powernow.c
> index a9b7792..54e7808 100644
> --- a/xen/arch/x86/acpi/cpufreq/powernow.c
> +++ b/xen/arch/x86/acpi/cpufreq/powernow.c
> @@ -146,7 +146,40 @@ static int powernow_cpufreq_target(struct cpufreq_policy *policy,
>  
>      return 0;
>  }
> +#define MSR_AMD_PSTATE_DEF_BASE     0xc0010064
> +static void amd_fixup_frequency(struct xen_processor_px *px, int i)
> +{
> +	u32 hi, lo, fid, did;
> +	int index = px->control & 0x00000007;
> +
> +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> +		return;
> +
> +	if ((boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model < 10)
> +	    || boot_cpu_data.x86 == 0x11) {
> +		rdmsr(MSR_AMD_PSTATE_DEF_BASE + index, lo, hi);
> +        /* Bit 63 indicates whether contents are valid */
> +        if (!(hi & 0x80000000))
> +            return;

Something's funny with this indentation.

> +
> +		fid = lo & 0x3f;
> +		did = (lo >> 6) & 7;
> +		if (boot_cpu_data.x86 == 0x10)
> +			px->core_frequency = (100 * (fid + 0x10)) >> did;
> +		else
> +			px->core_frequency = (100 * (fid + 8)) >> did;
> +	}
> +}
> +
> +static void amd_fixup_freq(struct processor_performance *perf)
> +{
>  
> +    int i;
> +
> +    for (i = 0; i < perf->state_count; i++)
> +        amd_fixup_frequency(perf->states, i);
> +
> +}
>  static int powernow_cpufreq_verify(struct cpufreq_policy *policy)
>  {
>      struct acpi_cpufreq_data *data;
> @@ -158,6 +191,8 @@ static int powernow_cpufreq_verify(struct cpufreq_policy *policy)
>  
>      perf = &processor_pminfo[policy->cpu]->perf;
>  
> +    amd_fixup_freq(perf);
> +
>      cpufreq_verify_within_limits(policy, 0, 
>          perf->states[perf->platform_limit].core_frequency * 1000);
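
For reference, the FID/DID arithmetic from the quoted patch can be tried
out in isolation. The following is a standalone sketch, assuming the bit
layout and formulas exactly as they appear in the patch above (I have not
rechecked them against the BKDG here); the function name amd_pstate_mhz()
and the register values in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode the core frequency in MHz from the two 32-bit halves of a
 * P-state definition MSR, following the quoted patch:
 *   fid = bits 5:0, did = bits 8:6, bit 63 = contents valid.
 * Returns 0 if the valid bit is clear.
 */
static unsigned int amd_pstate_mhz(unsigned int family, uint32_t lo, uint32_t hi)
{
	uint32_t fid, did;

	/* The patch only handles families 0x10 and 0x11. */
	assert(family == 0x10 || family == 0x11);

	/* Bit 63 of the MSR is bit 31 of the high half. */
	if (!(hi & 0x80000000u))
		return 0;

	fid = lo & 0x3f;
	did = (lo >> 6) & 7;

	if (family == 0x10)
		return (100 * (fid + 0x10)) >> did;

	return (100 * (fid + 8)) >> did;	/* family 0x11 */
}
```

With these formulas, a family 0x10 part with fid = 0x0c and did = 0
comes out at 100 * (12 + 16) = 2800 MHz, and the same fid with did = 1
at half of that, 1400 MHz.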

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.