linux-kernel - Re: CFS Bandwidth Control - Test results of cgroups tasks pinned vs unpinnede


Message-ID: <20110913174150.GA3062@linux.vnet.ibm.com>
Date:	Tue, 13 Sep 2011 23:11:50 +0530
From:	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Paul Turner <pjt@...gle.com>,
	Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
	Vladimir Davydov <vdavydov@...allels.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Bharata B Rao <bharata@...ux.vnet.ibm.com>,
	Dhaval Giani <dhaval.giani@...il.com>,
	Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	Pavel Emelianov <xemul@...allels.com>
Subject: Re: CFS Bandwidth Control - Test results of cgroups tasks pinned vs
 unpinnede

* Peter Zijlstra <a.p.zijlstra@...llo.nl> [2011-09-13 18:33:09]:

> On Tue, 2011-09-13 at 21:51 +0530, Srivatsa Vaddagiri wrote:
> > > which increases the time you force a task to sleep that's holding locks etc..
> > 
> > Ideally all tasks should get capped at the same time, given that there is
> > a global pool from which everyone pulls bandwidth? So while one vcpu/task
> > (holding a lock) gets capped, other vcpus/tasks (that may want the same lock)
> > should ideally not be running for long after that, avoiding lock inversion
> > related problems you point out.
> 
> No this simply cannot be true.. You force groups to sleep so that other
> groups can run, right? Therefore shared kernel locks will cause
> inversion.

Ah, shared locks of the "host" kernel. True, those can still cause
lock inversion, yes.

I had in mind user-space (or "guest" kernel) locks, which can't get inverted
that easily. The case there is one of a cgroup's tasks wanting a "userspace"
lock that is held by another, "throttled" task of the same cgroup, causing an
inversion problem of sorts. My point was that once a task gets throttled, its
sibling tasks should get throttled almost immediately after that (given that
bandwidth for a cgroup is maintained in a global pool from which everyone draws
in "small" increments), so a task that gets capped while holding a user-space
lock should not leave its siblings starved for long on that lock within the
same period?
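
To make the "throttled almost immediately after" argument concrete, here is a
toy model of one period. It is only a sketch under stated assumptions, not the
scheduler's actual code: the function name, the tick granularity and all the
numbers are made up for illustration. Each CPU pulls a fixed slice from the
shared pool when its local runtime is used up, and is throttled when the pool
has nothing left to give.

```python
def simulate_period(quota_us, slice_us, start_us, tick_us=100):
    """Toy model of one bandwidth period (hypothetical, for illustration).

    quota_us: runtime in the cgroup's global pool for this period
    slice_us: amount a CPU pulls from the global pool at a time
    start_us: per-CPU time at which that CPU's task starts running
    Returns the time at which each CPU's task gets throttled.
    """
    global_pool = quota_us
    local = [0] * len(start_us)            # runtime already pulled per CPU
    throttled_at = [None] * len(start_us)
    now = 0
    while any(t is None for t in throttled_at):
        for cpu, start in enumerate(start_us):
            if throttled_at[cpu] is not None or now < start:
                continue
            if local[cpu] <= 0:
                # Local runtime exhausted: try to pull another slice.
                grant = min(slice_us, global_pool)
                global_pool -= grant
                local[cpu] += grant
            if local[cpu] <= 0:
                # Global pool is empty too, so this CPU's task is capped.
                throttled_at[cpu] = now
            else:
                local[cpu] -= tick_us      # task runs for one tick
        now += tick_us
    return throttled_at


# Four sibling tasks that start 1 ms apart, 20 ms quota, 5 ms slices.
times = simulate_period(20000, 5000, [0, 1000, 2000, 3000])
spread_us = max(times) - min(times)
print(times, spread_us)
```

In this model, as long as every task has started before the pool runs dry, the
gap between the first and the last sibling being throttled cannot exceed one
slice, since no CPU ever holds more than one slice of local runtime.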

> You cannot put both groups to sleep and still expect a utilization of
> 100%.
> 
> Simple example, some task in group A owns the i_mutex of a file, group A
> runs out of time and gets dequeued. Some other task in group B needs
> that same i_mutex.
> 
> > I guess that we may still run into that with current implementation ..
> > Basically global pool may have zero runtime left for current period,
> > forcing a vcpu/task to be throttled, while there is surplus runtime in
> > per-cpu pools, allowing some sibling vcpus/tasks to run for wee bit
> > more, leading to lock-inversion related problems (more idling). That
> > makes me think we can improve directed yield->capping interaction.
> > Essentially when the target task of directed yield is capped, can the
> > "yielding" task donate some of its bandwidth? 
> 
> What moron ever calls yield anyway?

I meant directed yield (yield_to), which KVM uses when it detects
pause-loops. Essentially, a vcpu spinning in guest-kernel context for too long
triggers a PLE (Pause-Loop-Exit), which leads the KVM driver to do a directed
yield to another sibling vcpu. The target of that directed yield may be a
capped vcpu task, in which case I was wondering if directed yield can donate a
bit of bandwidth to the throttled task. Again, going by what I said earlier about
tasks getting capped at more or less the same time, this should occur very
infrequently ... something for me to test and find out nevertheless!
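
A rough sketch of the donation idea, to show what I have in mind. This is a
hypothetical model only: the Vcpu class, donate_on_yield() and the amounts are
invented for illustration, and nothing like this exists in yield_to today. The
yielding vcpu hands part of its locally held runtime to a throttled target so
that the target can run again, with the cgroup's total runtime unchanged.

```python
from dataclasses import dataclass


@dataclass
class Vcpu:
    """Hypothetical per-vcpu accounting state (not a kernel structure)."""
    name: str
    local_runtime_us: int
    throttled: bool = False


def donate_on_yield(yielder, target, donate_us):
    """Move up to donate_us of runtime from yielder to a throttled target.

    Returns the amount actually donated. Nothing is donated if the target
    is not throttled or the yielder has no local runtime left.
    """
    if not target.throttled:
        return 0
    donated = min(donate_us, yielder.local_runtime_us)
    if donated <= 0:
        return 0
    yielder.local_runtime_us -= donated
    target.local_runtime_us += donated
    target.throttled = False   # target has runtime again, so it can run
    return donated


# A spinning vcpu yields to the capped vcpu that holds the lock.
spinner = Vcpu("vcpu0", local_runtime_us=3000)
holder = Vcpu("vcpu1", local_runtime_us=0, throttled=True)
donate_on_yield(spinner, holder, 1000)
```

Runtime is only moved between siblings here, never created, so the cgroup's
cap for the period would still hold.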

> If you use yield you're doing it wrong!