linux-kernel - Re: Performance regression in 2.6.30-rc1

Message-ID: <20090602141606.GU5077@dirshya.in.ibm.com>
Date:	Tue, 2 Jun 2009 19:46:06 +0530
From:	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>
To:	poornima nayak <mpnayak@...ux.vnet.ibm.com>
Cc:	linux-kernel@...r.kernel.org, venkatesh.pallipadi@...el.com,
	davej@...hat.com, ego@...ibm.com
Subject: Re: Performance regression in 2.6.30-rc1

* Poornima Nayak <mpnayak@...ux.vnet.ibm.com> [2009-06-02 16:30:19]:

> Hi,
> 
> Running kernbench on 2.6.30-rc1 showed a performance regression. We
> ran git-bisect between v2.6.29 and v2.6.30-rc5, and after 13
> iterations it identified the attached patch as the cause of the
> regression.
> 
> Performance data of 2.6.29 without applying the attached patch.
> param-version  testname                                        elapsed-avg  elapsed-std
> 2.6.29'        pm_kernbench.Version-none-threads=2-sched_mc=2        221.1         0.81
> 2.6.29'        pm_kernbench.Version-none-threads=4-sched_mc=0       115.09          0.6
> 2.6.29'        pm_kernbench.Version-none-threads=4-sched_mc=2       109.05         0.25
> 2.6.29'        pm_kernbench.Version-none-threads=8-sched_mc=2         60.4         0.38
> 2.6.29'        pm_kernbench.Version-none-threads=8-sched_mc=0        65.23         0.34
> 2.6.29'        pm_kernbench.Version-none-threads=2-sched_mc=0       231.61         0.59
> 
> Performance data of 2.6.29 after applying the attached patch.
> param-version  testname                                               elapsed-avg  elapsed-std
> 2.6.29'        pm_kernbench.Version-thir-bisect-threads=2-sched_mc=0       203.77         0.48
> 2.6.29'        pm_kernbench.Version-thir-bisect-threads=8-sched_mc=0        64.38         0.25
> 2.6.29'        pm_kernbench.Version-thir-bisect-threads=4-sched_mc=0       102.46          0.1
> 2.6.29'        pm_kernbench.Version-thir-bisect-threads=8-sched_mc=2        59.94         0.46
> 2.6.29'        pm_kernbench.Version-thir-bisect-threads=4-sched_mc=2       106.84         0.28
> 2.6.29'        pm_kernbench.Version-thir-bisect-threads=2-sched_mc=2       199.44         0.44
> 
> The performance issue is that when sched_mc_power_savings is set to 2
> and kernbench is run with 4 threads, the elapsed time is higher than
> when sched_mc_power_savings is set to 0. The expectation is that
> elapsed time should be lower with sched_mc_power_savings set to 2 than
> with sched_mc_power_savings set to 0.

Hi Poornima,

The table seems to be mangled.  Can you please resend it and sort the
results so that the sched_mc=0 and sched_mc=2 rows for the same number
of threads are adjacent?  The results are difficult to follow as posted.

Also, there seems to be a 10% improvement at each run level with the
patch.  So why are you calling this a performance regression?

With the patch, sched_mc=2 takes about 4 seconds longer than sched_mc=0
only in the 4-thread case; the other scenarios show an overall
improvement.

I assume you ran this on an 8-core box.

Also, did you see this code being invoked on the test machine?  Did you
see the "Capping off P-state tranision latency" message in the kernel
log?  This patch may be affecting the ondemand governor, but I am unable
to relate this to a performance impact.

--Vaidy


> 
> Regds
> Poornima

> diff --git a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
> index 4b1c319..89c676d 100644
> --- a/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
> +++ b/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
> @@ -680,6 +680,18 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
>  			    perf->states[i].transition_latency * 1000;
>  	}
> 
> +	/* Check for high latency (>20uS) from buggy BIOSes, like on T42 */
> +	if (perf->control_register.space_id == ACPI_ADR_SPACE_FIXED_HARDWARE &&
> +	    policy->cpuinfo.transition_latency > 20 * 1000) {
> +		static int print_once;
> +		policy->cpuinfo.transition_latency = 20 * 1000;
> +		if (!print_once) {
> +			print_once = 1;
> +			printk(KERN_INFO "Capping off P-state tranision latency"
> +				" at 20 uS\n");
> +		}
> +	}
> +
>  	data->max_freq = perf->states[0].core_frequency * 1000;
>  	/* table init */
>  	for (i=0; i<perf->state_count; i++) {
