linux-kernel - Re: [PATCH 1/1, v7] cgroup/freezer: add per freezer duty ratio control

Message-Id: <20110215111857.47907dc5.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Tue, 15 Feb 2011 11:18:57 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Arjan van de Ven <arjan@...ux.intel.com>,
	Matt Helsley <matthltc@...ibm.com>,
	jacob.jun.pan@...ux.intel.com, LKML <linux-kernel@...r.kernel.org>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	container cgroup <containers@...ts.linux-foundation.org>,
	Li Zefan <lizf@...fujitsu.com>,
	Paul Menage <menage@...gle.com>, rdunlap@...otime.net,
	Cedric Le Goater <clg@...t.ibm.com>
Subject: Re: [PATCH 1/1, v7] cgroup/freezer: add per freezer duty ratio
 control

On Mon, 14 Feb 2011 15:07:30 -0800
Andrew Morton <akpm@...ux-foundation.org> wrote:

> On Sun, 13 Feb 2011 19:23:10 -0800
> Arjan van de Ven <arjan@...ux.intel.com> wrote:
> 
> > On 2/13/2011 4:44 PM, KAMEZAWA Hiroyuki wrote:
> > > On Sat, 12 Feb 2011 15:29:07 -0800
> > > Matt Helsley<matthltc@...ibm.com>  wrote:
> > >
> > >> On Fri, Feb 11, 2011 at 11:10:44AM -0800, jacob.jun.pan@...ux.intel.com wrote:
> > >>> From: Jacob Pan<jacob.jun.pan@...ux.intel.com>
> > >>>
> > >>> Freezer subsystem is used to manage batch jobs which can start
> > >>> and stop at the same time. However, sometimes it is desirable to let
> > >>> the kernel manage the freezer state automatically with a given
> > >>> duty ratio.
> > >>> For example, if we want to reduce the time that background apps
> > >>> are allowed to run we can put them into a freezer subsystem and
> > >>> set the kernel to turn them THAWED/FROZEN at given duty ratio.
> > >>>
> > >>> This patch introduces two file nodes under cgroup
> > >>> freezer.duty_ratio_pct and freezer.period_sec
> > >> Again: I don't think this is the right approach in the long term.
> > >> It would be better not to add this interface and instead enable the
> > >> cpu cgroup subsystem for non-rt tasks using a similar duty ratio
> > >> concept..
> > >>
> > >> Nevertheless, I've added some feedback on the code for you here :).
> > >>
> > > AFAIK, there was work on bandwidth control in CFS.
> > >
> > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2010-10/msg04335.html
> > >
> > > I tested this and it worked fine. This scheduler approach seems better
> > > than freezer for my purpose of limiting the bandwidth of applications.
> > 
> > for our purpose, it's not about bandwidth.
> > it's about making sure the class of apps don't run for a long period 
> > (30-second range) of time.
> > 
> 
> The discussion about this patchset seems to have been upside-down: lots
> of talk about a particular implementation, with people walking back
> from the implementation trying to work out what the requirements were,
> then seeing if other implementations might suit those requirements. 
> Whatever they were.
> 
> I think it would be helpful to start again, ignoring (for now) any
> implementation.
> 
> 
> What are the requirements here, guys?  What effects are we actually
> trying to achieve?  Once that is understood and agreed to, we can
> think about implementations.
> 
> 
> And maybe you people _are_ clear about the requirements.  But I'm not and
> I'm sure many others aren't too.  A clear statement of them would help
> things along and would doubtless lead to better code.  This is pretty
> basic stuff!
> 

OK, my (our) requirements are mainly these two:

- control batch jobs.
- control kvm and limit usage of cpu.

Considering kvm, we need to allow interactive jobs and batch jobs to be
placed on the same cpu. This is where our requirements differ.
We need latency-sensitive control and a static guarantee on the performance
limit. For example, when a user limits a process to 50% of a cpu and then
checks cpu usage with 'top -d 1', he should see a value of almost '50%'.


IIUC, freezer is like a system to deliver SIGSTOP: it sets tasks to
TASK_UNINTERRUPTIBLE and makes them sleep. The check is done at the
usual signal-check places and at some hooks in kernel threads.
This means the subsystem checks all threads one by one and sets flags,
and each task finally becomes TASK_UNINTERRUPTIBLE when it wakes up.
So, sleep/wakeup cost depends on the number of tasks, and a task may
not be freezable until it reaches a try_to_freeze() hook.

I hear that when using FUSE, a task may never freeze if the process serving
the FUSE operation is frozen before the task itself freezes. This suggests
the freezer cgroup is not easy to use.

CFS+bandwidth is a scheduler.
It removes a sub scheduler entity from the tree when it exceeds its allowed
time slice. The cost of calculating the allowed time slice is incurred in the
scheduler, but I think it will not be too heavy. (Because the maintainers will
see what's going on and they are sensitive to the cost.)
Tasks are all RUNNABLE. A task in the group releases the cpu when it sees the
'reschedule' flag. We have plenty of cond_resched() hooks. (And we know
we try to change a spin_lock to a mutex if the spin_lock is a huge cost.)
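
To make the quota idea concrete, here is a rough sketch in Python (not kernel
code, and not the actual CFS bandwidth patch). The class BandwidthGroup, its
fields, and the 1 ms tick are all hypothetical names and values chosen for
illustration. It only shows the accounting: a group is charged for the time
it runs, is throttled when its quota for the period is used up, and is
refilled at the start of the next period.

```python
class BandwidthGroup:
    """Hypothetical model of a group with a runtime quota per period."""

    def __init__(self, quota_ms, period_ms):
        self.quota_ms = quota_ms
        self.period_ms = period_ms
        self.remaining_ms = quota_ms
        self.throttled = False

    def account(self, ran_ms):
        # Charge the runtime; throttle once the quota is exhausted.
        # (In the scheduler this is where the entity would leave the tree.)
        self.remaining_ms -= ran_ms
        if self.remaining_ms <= 0:
            self.throttled = True

    def refill(self):
        # A new period starts: restore the quota and unthrottle.
        self.remaining_ms = self.quota_ms
        self.throttled = False


def simulate(group, total_ms, tick_ms=1):
    """Return how long an always-runnable task in the group actually ran."""
    ran = 0
    for now in range(0, total_ms, tick_ms):
        if now % group.period_ms == 0:
            group.refill()
        if not group.throttled:
            group.account(tick_ms)
            ran += tick_ms
    return ran


ran_ms = simulate(BandwidthGroup(quota_ms=50, period_ms=100), total_ms=1000)
```

With a 50 ms quota in a 100 ms period (assumed values), the task never changes
state itself; the group simply stops being scheduled for the second half of
each period.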

This will show a good performance result even with 'top -d 1'. We'll not see
TASK_RUNNING <-> TASK_INTERRUPTIBLE status changes. And I think we can make
the time slice period smaller than with freezer, for better latency.
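
As a rough illustration of the 'top -d 1' point, the following Python sketch
(a simplified model, not a measurement) computes what a 1-second sampler would
report for an always-busy task limited to 50%. The 30-second period follows
the "30-second range" mentioned in the quoted mail; the 100 ms period for the
bandwidth case is my assumption, and the function names are hypothetical.

```python
def run_time_ms(t_ms, period_ms, run_ms):
    """Time the task was allowed to run in [0, t_ms), running first in each period."""
    full, rest = divmod(t_ms, period_ms)
    return full * run_ms + min(rest, run_ms)


def sampled_usage(period_ms, ratio_pct, window_ms=1000, total_ms=60000):
    """CPU usage (in percent) seen by a sampler with the given window."""
    run_ms = period_ms * ratio_pct // 100
    samples = []
    for start in range(0, total_ms, window_ms):
        ran = (run_time_ms(start + window_ms, period_ms, run_ms)
               - run_time_ms(start, period_ms, run_ms))
        samples.append(100 * ran // window_ms)
    return samples


# Long duty-cycle period (freezer-like): 15 s running, 15 s stopped.
freezer_like = sampled_usage(period_ms=30000, ratio_pct=50)
# Short period (bandwidth-like): 50 ms running in every 100 ms.
bandwidth_like = sampled_usage(period_ms=100, ratio_pct=50)
```

In this model both cases average 50% over a minute, but only the short period
gives a steady 50% in every 1-second sample; the long period alternates
between 100% and 0%.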


Thanks,
-Kame