Date:	Mon, 2 Nov 2015 20:20:58 -0200
From:	Marcelo Tosatti <mtosatti@...hat.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Fenghua Yu <fenghua.yu@...el.com>,
	H Peter Anvin <hpa@...or.com>,
	Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
	x86 <x86@...nel.org>,
	Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
	Karen Noel <knoel@...hat.com>,
	Paolo Bonzini <pbonzini@...hat.com>,
	Luiz Capitulino <lcapitulino@...hat.com>,
	Clark Williams <williams@...hat.com>, Tejun Heo <tj@...nel.org>
Subject: cat cgroup interface proposal (non hierarchical) was Re: [PATCH V15
 00/11] x86: Intel Cache Allocation Technology Support

On Fri, Oct 16, 2015 at 11:50:22AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 15, 2015 at 11:28:52PM -0300, Marcelo Tosatti wrote:
> > On Thu, Oct 15, 2015 at 01:36:14PM +0200, Peter Zijlstra wrote:
> > > On Tue, Oct 13, 2015 at 06:31:27PM -0300, Marcelo Tosatti wrote:
> > > > I am rewriting the interface with ioctls, with commands similar to the
> > > > syscall interface proposed.
> > > 
> > > Which is horrible for other use cases. I really don't see the problem
> > > with the cgroup stuff.
> > 
> > Can you detail what "horrible" means? 
> 
> Say an RT scenario; you set up your machine with cgroups. You create a
> cpuset with is disjoint from the others, you frob around with the cpu
> cgroup, etc..
> 
> So once you're all done, you start your RT app into a cgroup.
> 
> But oh, fail, now you have to go muck about with ioctl()s to get the
> cache allocation cruft to work.

Peter, what follows is your cgroup proposal (extended), but
at the end there is a point about this cgroup interface being
unable to share cache between tasks, which IMO renders it unusable
(as it blocks any threads from sharing reserved cache).

If you have any ideas on how to circumvent this, they are appreciated.

A non-hierarchical cgroup CAT interface proposal follows. Thanks to some of the
CC'ed Red Hat folks for early comments.

cgroup CAT interface (non hierarchical):
---------------------------------------

0) Main directory files:

cat_hw_info
-----------
CAT HW information: CBM length, CDP supported, etc.
Information is reported per-socket, as sockets can have
different configurations. Perhaps this should be inside
sysfs.

1) Sub-directories represent cache reservations (size,type).

mkdir cache_reservation_for_forwarding_app
cd cache_reservation_for_forwarding_app
echo "80KB" > size
echo "data_and_code" > type
echo "socketmask=0xfffff" > socketmask    # optional
echo "1" > enable
echo "pid-of-forwarding-main-thread pid-of-helper-thread ..." > tasks

Files:

type
----
{data_and_code, only_code, only_data}. Type of
L3 CAT cache allocation to use. only_code and only_data are
supported only on CDP-capable processors.

size
----
Size of the L3 cache reservation.

rounding
--------
{round_down, round_up}. Whether to round the allocation size
(in kbytes) down or up to a multiple of the cache-way size.

Default: round_up
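
To illustrate the rounding rule, here is a minimal Python sketch. The
per-way size (way_kb) is a hypothetical parameter: the proposal does not
fix it, and in practice it would presumably be derived from cat_hw_info.

```python
def round_to_ways(size_kb, way_kb, rounding="round_up"):
    """Return (number_of_ways, effective_size_kb) for a requested size.

    way_kb is an assumed per-way size; nothing in the proposal fixes it.
    """
    if size_kb <= 0 or way_kb <= 0:
        raise ValueError("sizes must be positive")
    if rounding == "round_up":
        ways = -(-size_kb // way_kb)  # ceiling division
    elif rounding == "round_down":
        ways = size_kb // way_kb
    else:
        raise ValueError("rounding must be round_up or round_down")
    return ways, ways * way_kb
```

With an assumed 64KB way, a request of 80KB rounds up to 2 ways (128KB)
and rounds down to 1 way (64KB).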

socketmask
----------
Mask of sockets where the reservation is in effect.
A zero bit means the task will not have the L3 cache
portion that the cgroup references reserved on that socket.
Default: all sockets set.
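
A small Python sketch of how such a mask could be interpreted. It assumes
that bit N of the mask corresponds to socket N, which the proposal does
not state explicitly; the function names are illustrative only.

```python
def sockets_in_mask(socketmask, nr_sockets):
    """List the socket ids whose bit is set in socketmask.

    Assumes bit N maps to socket N (an assumption, not part of the proposal).
    """
    return [s for s in range(nr_sockets) if socketmask & (1 << s)]


def default_socketmask(nr_sockets):
    """Default from the proposal: all sockets set."""
    return (1 << nr_sockets) - 1
```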

enable
------
Allocate reservation with parameters set above.

When a reservation is enabled, it reserves L3 cache
space on every socket that is specified in "socketmask".

After the cgroup has been enabled by a write of "1" to the
"enable" file, only the "tasks" file can be modified.
To change the size of a cgroup reservation, recreate the directory.

tasks
-----

Contains the list of tasks which use this cache reservation.

Error reporting
---------------

Errors are reported in response to writes as appropriate.
For example, writing 1 to "enable" when there is not enough space
on the sockets in "socketmask" would return -ENOSPC, and writing to
"enable" without size being set would return -EINVAL.
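
The error behaviour can be modelled roughly as below. This is a user-space
Python sketch, not kernel code: free_ways_per_socket is a hypothetical
bookkeeping list (free cache ways per socket), and bit N of socketmask is
assumed to mean socket N.

```python
import errno


def enable_reservation(size_ways, socketmask, free_ways_per_socket):
    """Model a write of "1" to the enable file.

    Returns 0 on success or a negative errno value.
    free_ways_per_socket is a hypothetical list of free ways per socket.
    """
    if size_ways is None:
        return -errno.EINVAL  # size was never written
    wanted = [s for s in range(len(free_ways_per_socket))
              if socketmask & (1 << s)]
    if any(free_ways_per_socket[s] < size_ways for s in wanted):
        return -errno.ENOSPC  # at least one requested socket lacks room
    for s in wanted:          # commit only after every socket was checked
        free_ways_per_socket[s] -= size_ways
    return 0
```

Checking all sockets before committing keeps a failed enable from leaving
a partial reservation behind.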

Listing
-------
To list which reservations are in place, search for subdirectories
where the "enable" file has value 1.
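
A possible user-space helper for that search, as a Python sketch. The
mount point passed as root is whatever the cgroup was mounted on, and the
flag file name defaults to "enable" as in the file list above; both are
assumptions about an interface that does not exist yet.

```python
from pathlib import Path


def list_enabled_reservations(root, flag_name="enable"):
    """Return names of sub-directories of root whose flag file reads "1"."""
    enabled = []
    for d in sorted(Path(root).iterdir()):
        flag = d / flag_name
        if d.is_dir() and flag.is_file() and flag.read_text().strip() == "1":
            enabled.append(d.name)
    return enabled
```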

Semantics: A task has a guaranteed cache reservation on any CPU where it is
scheduled in, for the lifetime of the cgroup, as long as that task is
not attached to further cgroups.

That is, a task belonging to cgroup-A can have its cache reservation
invalidated when attached to cgroup-B (reasoning: it might be necessary
to reallocate the CBMs to respect contiguous bits in cache, a
restriction of the CAT HW interface).
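
To make the contiguity restriction concrete, a short Python sketch. The
helper names and the 8-bit CBM length used in the note below are made up
for illustration; only the "set bits must be contiguous" rule comes from
the text above.

```python
def is_contiguous(cbm):
    """True if the set bits of cbm form one unbroken run."""
    if cbm == 0:
        return False
    low = cbm & -cbm  # lowest set bit
    return ((cbm + low) & cbm) == 0


def find_contiguous(free_mask, nbits, cbm_len):
    """Return a mask of nbits contiguous free ways, or None if none exists."""
    run = (1 << nbits) - 1
    for shift in range(cbm_len - nbits + 1):
        candidate = run << shift
        if candidate & free_mask == candidate:
            return candidate
    return None
```

With free ways 0b10010110 in an 8-bit CBM, four ways are free but no run
of three exists, so find_contiguous() returns None for nbits=3. Satisfying
such a request would mean moving existing CBMs, which is why an existing
reservation cannot be guaranteed across reallocation.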


-------
BLOCKER
-------

Can't use cgroups for CAT because:

"Control Groups extends the kernel as follows:

 - Each task in the system has a reference-counted pointer to a
   css_set.

 - A css_set contains a set of reference-counted pointers to
   cgroup_subsys_state objects, one for each cgroup subsystem
   registered in the system."

You need a task to be part of two cgroups at one time, 
to support the following configuration:

Task-A: 70% of cache reserved exclusively (reservation-0).
        20% of cache reserved (reservation-1).

Task-B: 20% of cache reserved (reservation-1).

Unless reservations are created separately, then added to cgroups:

mount -t cgroup ... /../catcgroup/
cd /../catcgroup/
# create reservations
cd reservations
mkdir reservation-1
cd reservation-1
echo "80K" > size
echo "socketmask" > ...
echo "1" > enable
cd ..
mkdir reservation-2
cd reservation-2
echo "160K" > size
echo "socketmask" > ...
echo "1" > enable
cd ..
# attach reservation to cgroups
cd /../catcgroup/
mkdir cgroup-for-threaded-app
cd cgroup-for-threaded-app
echo reservation-1 reservation-2 > reservations
echo "mainthread" > tasks
cd ..
mkdir cgroup-for-helper-thread
cd cgroup-for-helper-thread
echo reservation-1 > reservations
echo "helperthread" > tasks
cd ..

This way mainthread and helperthread can share "reservation-1".
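
The resulting relationships can be written down as a small Python model.
This is only a sketch of the bookkeeping, not of any kernel data
structure: each task still sits in exactly one cgroup, and sharing
happens through a many-to-many link between cgroups and reservations.

```python
# Reservations exist independently of cgroups (hypothetical model).
reservations = {"reservation-1": "80K", "reservation-2": "160K"}

# Each cgroup references one or more reservations.
cgroup_reservations = {
    "cgroup-for-threaded-app": ["reservation-1", "reservation-2"],
    "cgroup-for-helper-thread": ["reservation-1"],
}

# Each task is in exactly one cgroup of this hierarchy.
task_cgroup = {
    "mainthread": "cgroup-for-threaded-app",
    "helperthread": "cgroup-for-helper-thread",
}


def reservations_of(task):
    """Reservations a task reaches through its single cgroup."""
    return set(cgroup_reservations[task_cgroup[task]])


def shared_reservations(task_a, task_b):
    """Reservations that both tasks can use."""
    return reservations_of(task_a) & reservations_of(task_b)


def users_of(reservation):
    """Cgroups referencing a reservation: the linkage someone must maintain."""
    return sorted(cg for cg, rs in cgroup_reservations.items()
                  if reservation in rs)
```

users_of() is the reverse mapping that would have to be consulted before
a reservation could be resized or removed.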

But this is abusing cgroups in a way they were not designed for.
Who is going to maintain the linkage between reservations and cgroups?

