Date:	Tue, 15 Jul 2014 18:37:00 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Tejun Heo <tj@...nel.org>
Cc:	Tim Chen <tim.c.chen@...ux.intel.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	"H. Peter Anvin" <hpa@...or.com>,
	"David S.Miller" <davem@...emloft.net>,
	Ingo Molnar <mingo@...nel.org>,
	Chandramouli Narayanan <mouli@...ux.intel.com>,
	Vinodh Gopal <vinodh.gopal@...el.com>,
	James Guilford <james.guilford@...el.com>,
	Wajdi Feghali <wajdi.k.feghali@...el.com>,
	Jussi Kivilinna <jussi.kivilinna@....fi>,
	linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 6/7] sched: add function nr_running_cpu to expose
 number of tasks running on cpu

On Tue, Jul 15, 2014 at 11:21:49AM -0400, Tejun Heo wrote:
> On Tue, Jul 15, 2014 at 03:36:27PM +0200, Peter Zijlstra wrote:
> > So, just to expand on this: we're already getting 'bug' reports because
> > worker threads are not cgroup aware. If work gets generated inside some
> > cgroup, the workqueue doesn't care and runs the work wherever the
> > kworker happens to live (typically the root cgroup).
> > 
> > This means that the 'work' escapes the cgroup confines and creates
> > resource inversion etc. The same is of course true for nice and RT
> > priorities.
> > 
> > TJ, are you aware of this and/or have you given it any thought?
> 
> Yeap, I'm aware of the issue but haven't read any actual bug reports
> yet.  Can you point me to the reports?

lkml.kernel.org/r/53A8EC1E.1060504@...ux.vnet.ibm.com

The root-level workqueue thingies disturb the cgroup-level scheduling to
'some' extent.

That whole thread is somewhat confusing and I think there's more than
just this going on, but they're really seeing this as a pain point.
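To make the escape concrete, here's a minimal sketch (mine, not from
that thread; the names are made up): a task confined to a cpu-limited
cgroup queues a work item, but the item runs in a kworker that lives in
the root cgroup, so the cgroup's limits never apply to it.

#include <linux/workqueue.h>
#include <linux/sched.h>
#include <linux/printk.h>

static void my_work_fn(struct work_struct *work)
{
	/* current is a kworker here, not the task that queued us;
	 * it sits in the root cgroup no matter who did the queueing. */
	pr_info("work runs as %s (pid %d)\n", current->comm, current->pid);
}

static DECLARE_WORK(my_work, my_work_fn);

/* called from a task confined to a cpu-limited cgroup */
static void submit_from_confined_task(void)
{
	schedule_work(&my_work);	/* runs in a root-cgroup kworker */
}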

> Given that worker pool management is dynamic, spawning separate pools
> for individual cgroups on-demand should be doable.  Haven't been able
> to decide how much we should be willing to pay in terms of complexity
> yet.

Yah, I figured. Back before you ripped up the workqueue code I had a
worklet-PI patch in -rt, which basically sorted and ran works in
RR/FIFO priority order, including boosting the current work when a
higher-prio one was pending etc.
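Roughly along these lines, from memory (a sketch of the idea only, not
the actual -rt patch; locking and the boost itself are elided, and the
names are made up):

#include <linux/plist.h>

struct prio_work {
	struct plist_node node;		/* sorted: lower ->prio runs first */
	void (*fn)(struct prio_work *pw);
};

static PLIST_HEAD(pending);

static void queue_prio_work(struct prio_work *pw, int prio)
{
	plist_node_init(&pw->node, prio);
	plist_add(&pw->node, &pending);
}

static void run_highest(void)
{
	struct prio_work *pw;

	if (plist_head_empty(&pending))
		return;
	pw = plist_first_entry(&pending, struct prio_work, node);
	plist_del(&pw->node, &pending);
	/* the real thing also boosted the running work whenever
	 * something of higher prio showed up in the list */
	pw->fn(pw);
}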

I never really figured out a way to make the new concurrency-managed
stuff do something like that, and this 'problem' here is harder still,
because we're not dealing with static prios etc.

Ideally we'd run the works _in_ the same task-context (from a scheduler
POV) as the task creating the work. There's some very obvious problems
of implementation there, and some less obvious others, so bleh.
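For illustration only, the naive core might look like the below;
adopt_submitter_sched() is a made-up helper, it copies only policy and
RT priority, and it says nothing about cgroup migration, CFS weight, or
undoing the change afterwards, which is where those problems live:

#include <linux/sched.h>

static void adopt_submitter_sched(struct task_struct *worker,
				  struct task_struct *submitter)
{
	struct sched_param param = {
		.sched_priority = submitter->rt_priority,
	};

	/* error handling elided; sched_setscheduler() can fail */
	sched_setscheduler(worker, submitter->policy, &param);
}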

Also, there's the whole softirq trainwreck, which has many of the same
problems. Much of the network stack isn't necessarily aware of whom it's
doing work for, so there's no way to propagate that information.

Case in point, I suppose: the crypto stuff, which is a combination of
the two; god only knows whom we should be accounting it to and in what
context things should run.

Ideally a socket has a 'single' (ha! if only) owner that we'd know
throughout the entire rx/tx paths, but I doubt we actually have that.

(Note that there are people really suffering because of this...)

Same for the 'shiny' block-mq stuff I suppose :-(
