Date:	Tue, 8 May 2012 14:07:40 +0100
From:	"Daniel P. Berrange" <berrange@...hat.com>
To:	Nishanth Aravamudan <nacc@...ux.vnet.ibm.com>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>,
	mingo@...nel.org, pjt@...gle.com, paul@...lmenage.org,
	akpm@...ux-foundation.org, rjw@...k.pl, nacc@...ibm.com,
	paulmck@...ux.vnet.ibm.com, tglx@...utronix.de,
	seto.hidetoshi@...fujitsu.com, rob@...dley.net, tj@...nel.org,
	mschmidt@...hat.com, nikunj@...ux.vnet.ibm.com,
	vatsa@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
	linux-doc@...r.kernel.org, linux-pm@...r.kernel.org
Subject: Re: [PATCH v2 0/7] CPU hotplug, cpusets: Fix issues with cpusets
 handling upon CPU hotplug

On Fri, May 04, 2012 at 02:30:11PM -0700, Nishanth Aravamudan wrote:
> On 04.05.2012 [22:56:21 +0200], Peter Zijlstra wrote:
> > On Fri, 2012-05-04 at 13:46 -0700, Nishanth Aravamudan wrote:
> > > What about other users of cpusets (what are they?)? 
> > 
> > cpusets came from SGI, its traditionally used to partition _large_
> > machines. Things like the batch/job-schedulers that go with that type of
> > setup use it.
> 
> Yeah, I recall that usage (or some description similar). Do we have any
> other known users of cpusets (beyond libvirt)?

IIRC, the lxc.sf.net project also uses cpusets (no connection to the libvirt
LXC driver mentioned below, which is an alternative implementation of the
same concept).

> > I've no clue why libvirt uses it (or why one would use libvirt for that
> > matter).
> 
> Well, it is the case that libvirt does use it, and libvirt is used
> pretty widely (or so it seems to me). I don't use it (cpusets or libvirt
> :) either, but it seems like we should either tell libvirt directly that
> cpusets are inappropriate for their use-case (once we figure out what
> exactly that is, and why they chose cpusets) or work with them to
> support their use-case?

Libvirt uses the cpuset cgroups functionality in two of its
virtualization drivers:

 - LXC.  Container based virt. The cpuset controller is used to
   constrain all processes running inside the container to a
   specific collection of CPUs. While we could use the traditional
   sched_setaffinity() syscall at initial startup of the container,
   this is not so practical when we want to dynamically change the
   affinity of an existing container. It would require that we
   iterate over all tasks changing their affinity, and to avoid
   fork() race conditions we'd need to suspend the container while
   doing this. Thus we've long used the cpuset cgroups controller
   for LXC.

 - KVM.  Full machine virt. By default we use sched_setaffinity()
   to apply constraints on what host CPUs a VM executes on. Fairly
   recently we added the ability to optionally use the cpuset
   controller instead (only if the sysadmin has already mounted
   it). The advantage of this is that if we update the cpuset
   of an existing VM, then IIUC, the kernel will migrate its
   allocated memory to be local to the new CPU set mask.
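
To make the difference between the two mechanisms concrete, here is a
minimal sketch in Python (not libvirt code). The function names and the
cgroup directory argument are hypothetical; the only real interfaces
assumed are os.sched_setaffinity() and the cgroup v1 "cpuset.cpus" file.
The per-task variant has to visit every task and can miss tasks forked
while it runs, whereas the cpuset variant is a single write that covers
every task in the group.

```python
import os
from pathlib import Path


def format_cpu_list(cpus):
    """Render CPU ids in the kernel's list format, e.g. {0, 1, 2, 5} -> '0-2,5'."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def pin_tasks_individually(task_ids, cpus):
    """Per-task approach: one sched_setaffinity() call per task.

    Tasks forked after task_ids was collected are missed, which is why
    the container would have to be suspended first.
    """
    for tid in task_ids:
        os.sched_setaffinity(tid, cpus)


def pin_via_cpuset(cgroup_dir, cpus):
    """cpuset approach: one write, applied by the kernel to the whole group.

    cgroup_dir is a hypothetical path to the container's cpuset cgroup.
    """
    Path(cgroup_dir, "cpuset.cpus").write_text(format_cpu_list(cpus))
```

For example, pin_via_cpuset(group_dir, {0, 1, 2, 5}) writes "0-2,5" to
the group's cpuset.cpus file, where group_dir is wherever the sysadmin
has mounted the cpuset controller plus the container's group name.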

The pain point we're hitting is that upon suspend/restore the cgroups
cpuset masks are not preserved. This is not a problem for server virt
usage scenarios, but it is for desktop users with virt on laptops.

I don't see a viable alternative to the cpuset controller for our LXC
container driver. For KVM we could do without the cpuset controller
if there is an alternative way to tell the kernel to migrate the KVM
process memory to be local to the new CPU affinity set when using the
sched_setaffinity() call.
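
For reference, the cpuset behaviour we rely on for KVM can be sketched
roughly as follows. This is an illustration under assumptions, not what
libvirt does verbatim: the function name and the cgroup directory are
hypothetical, and it assumes the cgroup v1 cpuset files
"cpuset.memory_migrate", "cpuset.cpus" and "cpuset.mems". As far as I
understand the cpuset documentation, page migration is driven by the
memory node mask ("cpuset.mems") when "cpuset.memory_migrate" is set,
so both masks are updated together here.

```python
from pathlib import Path


def move_vm_cpuset(cgroup_dir, cpus, mems):
    """Re-pin a VM's cpuset and ask the kernel to migrate its memory.

    cgroup_dir: hypothetical path to the VM's cpuset cgroup directory.
    cpus, mems: kernel list-format strings such as "4-7" and "1".
    """
    group = Path(cgroup_dir)
    # Ask for already-allocated pages to follow later changes to the node mask.
    (group / "cpuset.memory_migrate").write_text("1")
    # New set of host CPUs the VM may run on.
    (group / "cpuset.cpus").write_text(cpus)
    # New set of memory nodes; with memory_migrate set this triggers migration.
    (group / "cpuset.mems").write_text(mems)
```

sched_setaffinity() on its own only changes where the tasks run, which
is the gap described above.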

We are open to suggestions of alternative approaches, particularly since
we have had no end of trouble with pretty much all of the kernel's
cgroups controllers :-(

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
