linux-kernel - Re: [CPUISOL] CPU isolation extensions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1201551730.28547.54.camel@lappy>
Date:	Mon, 28 Jan 2008 21:22:09 +0100
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Max Krasnyanskiy <maxk@...lcomm.com>, Paul Jackson <pj@....com>,
	LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
	Gregory Haskins <ghaskins@...ell.com>
Subject: Re: [CPUISOL] CPU isolation extensions


On Mon, 2008-01-28 at 14:00 -0500, Steven Rostedt wrote:
> 
> On Mon, 28 Jan 2008, Max Krasnyanskiy wrote:
> > >>   [PATCH] [CPUISOL] Support for workqueue isolation
> > >
> > > The thing about workqueues is that they should only be woken on a CPU if
> > > something on that CPU accessed them. IOW, the workqueue on a CPU handles
> > > work that was called by something on that CPU. Which means that
> > > something that high prio task did triggered a workqueue to do some work.
> > > But this can also be triggered by interrupts, so by keeping interrupts
> > > off the CPU no workqueue should be activated.
> 
> > No no no. That's what I though too ;-). The problem is that things like NFS and friends
> > expect _all_ their workqueue threads to report back when they do certain things like
> > flushing buffers and stuff. The reason I added this is because my machines were getting
> > stuck because CPU0 was waiting for CPU1 to run NFS work queue threads even though no IRQs
> > or other things are running on it.
> 
> This sounds more like we should fix NFS than add this for all workqueues.
> Again, we want workqueues to run on the behalf of whatever is running on
> that CPU, including those tasks that are running on an isolcpu.

agreed, by looking at my top output (and not the nfs code) it looks like
it just spawns a configurable number of active kernel threads which are
not cpu bound by in any way. I think just removing the isolated cpus
from their runnable mask should take care of them.

> 
> >
> > >>   [PATCH] [CPUISOL] Isolated CPUs should be ignored by the "stop machine"
> > >
> > > This I find very dangerous. We are making an assumption that tasks on an
> > > isolated CPU wont be doing things that stopmachine requires. What stops
> > > a task on an isolated CPU from calling something into the kernel that
> > > stop_machine requires to halt?
> 
> > I agree in general. The thing is though that stop machine just kills any kind of latency
> > guaranties. Without the patch the machine just hangs waiting for the stop-machine to run
> > when module is inserted/removed. And running without dynamic module loading is not very
> > practical on general purpose machines. So I'd rather have an option with a big red warning
> > than no option at all :).
> 
> Well, that's something one of the greater powers (Linus, Andrew, Ingo)
> must decide. ;-)

I'm in favour of better engineered method, that is, we really should try
to solve these problems in a proper way. Hacks like this might be fine
for custom kernels, but I think we should have a higher standard when it
comes to upstream - we all have to live many years with whatever we put
in there, we'd better think well about it.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/