linux-kernel - Re: [tip:sched/core] sched: cpuacct: Use bigger percpu counter batch values for stats counters

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20090716081010.GB3134@in.ibm.com>
Date:	Thu, 16 Jul 2009 13:40:10 +0530
From:	Bharata B Rao <bharata@...ux.vnet.ibm.com>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>, mingo@...hat.com,
	hpa@...or.com, linux-kernel@...r.kernel.org,
	a.p.zijlstra@...llo.nl, schwidefsky@...ibm.com,
	balajirrao@...il.com, dhaval@...ux.vnet.ibm.com,
	tglx@...utronix.de, kamezawa.hiroyu@...fujitsu.com,
	linux-tip-commits@...r.kernel.org,
	Anton Blanchard <anton@....ibm.com>
Subject: Re: [tip:sched/core] sched: cpuacct: Use bigger percpu counter
	batch values for stats counters

On Tue, May 12, 2009 at 07:44:37PM +0900, KOSAKI Motohiro wrote:
> ------------------------------------
> Subject: [PATCH] sched: cpuacct: Use bigger percpu counter batch values for stats counters
> 
> percpu counters used to accumulate statistics in cpuacct controller use
> the default batch value [max(2*nr_cpus, 32)] which can be too small for
> archs that define VIRT_CPU_ACCOUNTING. In such archs, a tick could result in
> cputime updates in the range of thousands. As a result, cpuacct_update_stats()
> would end up acquiring the percpu counter spinlock on every tick which
> is not good for performance.
> 
> Let those architectures to have a bigger batch threshold so that percpu counter
> spinlock isn't taken on every tick. This change doesn't affect the archs which
> don't define VIRT_CPU_ACCOUNTING and they continue to have the default
> percpu counter batch value.
> 
>  v7:
>  - fix typo and changelog
> 
>  v6:
>  - fix build error when UP
> 
>  v5:
>  - move cpuacct_batch initialization into sched_init()
> 
>  v4:
>   - rewrite patch description (thanks Bharata!)
>   - append read_mostly to cpuacct_batch
>   - cpuacct_batch is initialized by sched_init_debug()
> 
>  v3:
>   - revert using percpu_counter_sum()
> 
>  v2:
>   - use percpu_counter_sum() instead percpu_counter_read()
> 
> Cc: Balaji Rao <balajirrao@...il.com>
> Cc: Dhaval Giani <dhaval@...ux.vnet.ibm.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
> Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> Cc: Balbir Singh <balbir@...ux.vnet.ibm.com>
> Cc: Ingo Molnar <mingo@...e.hu>
> Cc: Martin Schwidefsky <schwidefsky@...ibm.com>
> Signed-off-by: Bharata B Rao <bharata@...ux.vnet.ibm.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
> ---
>  kernel/sched.c |    9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> Index: b/kernel/sched.c
> ===================================================================
> --- a/kernel/sched.c	2009-05-12 13:12:59.000000000 +0900
> +++ b/kernel/sched.c	2009-05-12 19:04:49.000000000 +0900
> @@ -870,6 +870,8 @@ static __read_mostly int scheduler_runni
>   */
>  int sysctl_sched_rt_runtime = 950000;
> 
> +static __read_mostly s32 cpuacct_batch;
> +
>  static inline u64 global_rt_period(void)
>  {
>  	return (u64)sysctl_sched_rt_period * NSEC_PER_USEC;
> @@ -9284,6 +9286,10 @@ void __init sched_init(void)
> 
>  	perf_counter_init();
> 
> +#ifdef CONFIG_SMP
> +	cpuacct_batch = jiffies_to_cputime(percpu_counter_batch);
> +#endif

On ppc64, calling jiffies_to_cputime() from sched_init() is too early because
jiffies_to_cputime() needs tb_ticks_per_sec which gets initialized only
later in time_init(). Because of this I see that cpuacct_batch will always
be zero effectively negating what this patch is trying to do.

As explained by you earlier, we too are finding the default batch value to
be too low for ppc64 with VIRT_CPU_ACCOUNTING turned on. Hence I guess
if this patch is taken in (ofcourse with the above issue fixed), it will
benefit ppc64 also.

Regards,
Bharata.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/