Message-ID: <20110913175425.GB3062@linux.vnet.ibm.com>
Date:	Tue, 13 Sep 2011 23:24:25 +0530
From:	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Paul Turner <pjt@...gle.com>,
	Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
	Vladimir Davydov <vdavydov@...allels.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Bharata B Rao <bharata@...ux.vnet.ibm.com>,
	Dhaval Giani <dhaval.giani@...il.com>,
	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Pavel Emelianov <xemul@...allels.com>
Subject: Re: CFS Bandwidth Control - Test results of cgroups tasks pinned vs
 unpinned

* Peter Zijlstra <a.p.zijlstra@...llo.nl> [2011-09-13 18:36:15]:
> > Value of sched_cfs_bandwidth_slice_us was reduced from default of 5000us
> > to 500us, which (along with reduction of min/max interval) helped cut down
> > idle time further (3.9% -> 2.7%). I was commenting that this may not necessarily
> > be optimal (as for example low 'sched_cfs_bandwidth_slice_us' could result
> > in all cpus contending for cfs_b->lock very frequently). 
> 
> Right.. so this seems to suggest you're migrating a lot.

We did run some experiments (outside of capping) to see how badly tasks
migrate on latest tip (compared to previous kernels). The test was to
spawn 32 cpu hogs on a 16-cpu system (placed in the default cgroup,
without any capping in place) and measure how much they bounce around.
The system had little load besides these cpu hogs.
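
For reference, a rough sketch of that kind of test (not the exact script
we ran; it assumes the per-task count is read from se.nr_migrations in
/proc/<pid>/sched, which needs CONFIG_SCHED_DEBUG):

    #!/bin/sh
    # spawn 32 cpu hogs on the 16-cpu box, all left in the default cgroup
    for i in $(seq 1 32); do
            while :; do :; done &
    done

    sleep 60        # let the load balancer move them around for a while

    # sum per-task migration counts
    total=0
    for pid in $(jobs -p); do
            n=$(awk '/se.nr_migrations/ { print $3 }' /proc/$pid/sched)
            total=$((total + n))
    done
    echo "total migrations across hogs: $total"

    kill $(jobs -p)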

We saw a considerably higher migration count on latest tip compared to
previous kernels. Kamalesh, can you please post the migration count
data?

> Also what workload are we talking about? the insane one with 5 groups of
> weight 1024?

We were never running the "insane" one .. we always run with proportional
shares, the "sane" one! I missed mentioning that bit in my first email
(about the shares setup). I am attaching the test script we are using
for your reference. FYI, we have added additional levels to the cgroup
setup (/Level1/Level2/C1/C1_1 etc.) to mimic the cgroup hierarchy for
VMs as created by libvirt.
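
In case it helps picture it, the hierarchy is roughly of this shape (a
sketch only; it assumes the cgroup v1 cpu controller mounted at
/cgroup/cpu and uses 1024 as an illustrative shares value; the full
setup, sibling groups and workloads are in the attached script):

    # assumes the cgroup v1 cpu controller is mounted at /cgroup/cpu
    mkdir -p /cgroup/cpu
    mount -t cgroup -o cpu cgroup /cgroup/cpu

    # extra levels mimicking the hierarchy libvirt creates for VMs
    mkdir -p /cgroup/cpu/Level1/Level2/C1/C1_1

    # proportional shares (1024 is illustrative), no cfs_quota capping
    for d in Level1 Level1/Level2 Level1/Level2/C1 Level1/Level2/C1/C1_1; do
            echo 1024 > /cgroup/cpu/$d/cpu.shares
    done

    # drop a cpu hog into the leaf group
    while :; do :; done &
    echo $! > /cgroup/cpu/Level1/Level2/C1/C1_1/tasks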

> Ramping up the frequency of the load-balancer and giving out smaller
> slices is really anti-scalability.. I bet a lot of that 'reclaimed' idle
> time is spend in system time. 

System time (in top and vmstat) does remain unchanged at 0% when
cranking up the load-balance frequency and slicing down
sched_cfs_bandwidth_slice_us .. I guess the additional "system" time
can't easily be accounted for by the tick-based accounting we have. I
agree there could be other unobserved side effects of the increased
load-balance frequency (such as on workload performance) that I haven't
noticed.
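
One way to look for that hidden system time (a sketch; it assumes perf
is available and built for the test kernel, and the symbol names are
only the ones I would expect to show up) is a short system-wide profile
while the capped workload runs with the smaller slice:

    # shrink the distribution slice as in the experiment above
    echo 500 > /proc/sys/kernel/sched_cfs_bandwidth_slice_us

    # 30s system-wide profile with call graphs while the workload runs
    perf record -a -g sleep 30

    # check how much time lands under load balancing / the bandwidth lock
    perf report --sort symbol --stdio | grep -E 'load_balance|_raw_spin_lock' | head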

- vatsa

Download attachment "hard_limit_test.sh" of type "application/x-sh" (6697 bytes)
