linux-kernel - Re: [PATCH 1/1] sched: Make schedstats a runtime tunable that is disabled by default v4

Message-ID: <20160203113911.GP8337@techsingularity.net>
Date:	Wed, 3 Feb 2016 11:39:11 +0000
From:	Mel Gorman <mgorman@...hsingularity.net>
To:	Ingo Molnar <mingo@...nel.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Matt Fleming <matt@...eblueprint.co.uk>,
	Mike Galbraith <mgalbraith@...e.de>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/1] sched: Make schedstats a runtime tunable that is
 disabled by default v4

On Wed, Feb 03, 2016 at 12:28:49PM +0100, Ingo Molnar wrote:
> 
> * Mel Gorman <mgorman@...hsingularity.net> wrote:
> 
> > Changelog since v3
> > o Force enable stats during profiling and latencytop
> > 
> > Changelog since V2
> > o Print stats that are not related to schedstat
> > o Reintroduce a static inline for update_stats_dequeue
> > 
> > Changelog since V1
> > o Introduce schedstat_enabled and address Ingo's feedback
> > o More schedstat-only paths eliminated, particularly ttwu_stat
> > 
> > schedstats is very useful during debugging and performance tuning but it
> > incurs overhead. As such, even though it can be disabled at build time,
> > it is often enabled as the information is useful.  This patch adds a
> > kernel command-line and sysctl tunable to enable or disable schedstats on
> > demand. It is disabled by default as someone who knows they need it can
> > also learn to enable it when necessary.
> > 
> > The benefits are workload-dependent but when it gets down to it, the
> > difference will be whether cache misses are incurred updating the shared
> > stats or not. [...]
> 
> Hm, which shared stats are those?

Extremely poor phrasing on my part. The stats share cache lines, and the
impact partially depends on whether unrelated stats land on the same cache
line during updates.
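
As a rough illustration of the cache-line point, the sketch below checks
whether two fields of a struct fall on the same line. It is a userspace
sketch only: struct packed_stats, its field names, and the 64-byte line
size are assumptions made for the example, not the scheduler's actual
layout.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64 /* assumption: 64-byte cache lines */

/* Hypothetical stand-in for a few stats packed next to each other. */
struct packed_stats {
        unsigned long long wait_sum;
        unsigned long long wait_count;
        unsigned long long sleep_max;
};

/* The whole hypothetical group fits in one line, so its fields can share it. */
static_assert(sizeof(struct packed_stats) <= CACHE_LINE_SIZE,
              "packed_stats is expected to fit in one cache line");

/* Index of the cache line that a byte offset falls into. */
static inline size_t cache_line_of(size_t offset)
{
        return offset / CACHE_LINE_SIZE;
}

/*
 * Non-zero when two byte offsets, measured from the start of a
 * cache-line-aligned object, land on the same cache line.
 */
static inline int shares_cache_line(size_t a, size_t b)
{
        return cache_line_of(a) == cache_line_of(b);
}
```

Calling shares_cache_line() with offsetof(struct packed_stats, wait_sum)
and offsetof(struct packed_stats, sleep_max) reports that the two fields
sit on one line, so an update to either touches the line holding both.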

> I think we should really fix those as well: 
> those shared stats should be percpu collected as well, with no extra cache misses 
> in any scheduler fast path.
> 

I looked into that but converting those stats to per-cpu counters would
incur sizable memory overhead. There are a *lot* of them and the basic
structure for the generic percpu-counter is

struct percpu_counter {
        raw_spinlock_t lock;
        s64 count;
#ifdef CONFIG_HOTPLUG_CPU
        struct list_head list;  /* All percpu_counters are on a list */
#endif
        s32 __percpu *counters;
};

That's not taking into account the associated runtime overhead, such as
synchronising them. Granted, some specialised implementation could be done
for the scheduler, but it would be massive overkill and a maintenance
burden for stats that most users do not even want.
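
To give a feel for the memory cost, here is a minimal userspace sketch
that estimates the footprint of one counter with the layout quoted above.
The mock types and their sizes are assumptions (a 4-byte lock without
debugging, CONFIG_HOTPLUG_CPU enabled), the helper names are made up for
the example, and the real per-cpu allocator may add padding that this
ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mock of struct list_head. */
struct mock_list_head {
        void *next, *prev;
};

/* Userspace mock of the quoted structure; field sizes are assumptions. */
struct mock_percpu_counter {
        uint32_t lock;               /* stand-in for raw_spinlock_t */
        int64_t count;
        struct mock_list_head list;  /* present with CONFIG_HOTPLUG_CPU */
        int32_t *counters;           /* stand-in for s32 __percpu * */
};

/* Bytes for one counter: the structure plus one s32 slot per CPU. */
static inline size_t percpu_counter_footprint(size_t nr_cpus)
{
        return sizeof(struct mock_percpu_counter) + nr_cpus * sizeof(int32_t);
}

/* Bytes for nr_counters such counters on a machine with nr_cpus CPUs. */
static inline size_t total_footprint(size_t nr_counters, size_t nr_cpus)
{
        return nr_counters * percpu_counter_footprint(nr_cpus);
}
```

Comparing percpu_counter_footprint(nr_cpus) against sizeof(uint64_t) for
a plain counter shows how the cost multiplies when every stat is
converted, and it grows further with the CPU count.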

-- 
Mel Gorman
SUSE Labs