linux-kernel - Re: [PATCH 5/9] x86/intel_rdt: Add new cgroup and Class of service management

Message-ID: <alpine.DEB.2.10.1508061349250.921@vshiva-Udesk>
Date:	Thu, 6 Aug 2015 13:58:39 -0700 (PDT)
From:	Vikas Shivappa <vikas.shivappa@...el.com>
To:	Tejun Heo <tj@...nel.org>
cc:	Vikas Shivappa <vikas.shivappa@...el.com>,
	Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
	linux-kernel@...r.kernel.org, x86@...nel.org, hpa@...or.com,
	tglx@...utronix.de, mingo@...nel.org, peterz@...radead.org,
	Matt Fleming <matt.fleming@...el.com>,
	"Auld, Will" <will.auld@...el.com>,
	"Williamson, Glenn P" <glenn.p.williamson@...el.com>,
	"Juvva, Kanaka D" <kanaka.d.juvva@...el.com>
Subject: Re: [PATCH 5/9] x86/intel_rdt: Add new cgroup and Class of service
 management



On Wed, 5 Aug 2015, Tejun Heo wrote:

> Hello,
>
> On Tue, Aug 04, 2015 at 07:21:52PM -0700, Vikas Shivappa wrote:
>>> I get that this would be an easier "bolt-on" solution but isn't a good
>>> solution by itself in the long term.  As I wrote multiple times
>>> before, this is a really bad programmable interface.  Unless you're
>>> sure that this doesn't have to be programmable for threads of an
>>> individual applications,
>>
>> Yes, this doesn't have to be a programmable interface for threads. May not be
>> a good idea to let the threads decide the cache allocation by themselves
>> using this direct interface. We are transferring the decision maker
>> responsibility to the system administrator.
>
> I'm having hard time believing that.  There definitely are use cases
> where cachelines are trashed among service threads.  Are you
> proclaiming that those cases aren't gonna be supported?

Please refer to the noisy neighbour example I give here, which shows how to
resolve thrashing caused by a noisy neighbour:
http://marc.info/?l=linux-kernel&m=143889397419199

and the reference
http://www.intel.com/content/www/us/en/communications/cache-allocation-technology-white-paper.html


>
>> - This interface like you said can easily bolt-on. basically an easy to use
>> interface without worrying about the architectural details.
>
> But it's rife with architectural details.

If specifying the bitmask is an issue, it can easily be addressed by writing a
script that calculates the bitmask from a size, as mentioned here:
http://marc.info/?l=linux-kernel&m=143889397419199
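
As a rough illustration of such a script, here is a minimal sketch in Python. The function name, its parameters, and the example cache geometry (a 20 MB cache with a 20-bit mask) are assumptions made for illustration and are not taken from the patch series. It assumes each mask bit covers an equal share of the cache and that the mask must be a contiguous run of set bits.

```python
def size_to_cbm(size_kb, cache_kb, cbm_len, shift=0):
    """Return a contiguous capacity bitmask covering at least size_kb.

    size_kb  -- requested cache size in KB (hypothetical parameter)
    cache_kb -- total size of the cache in KB
    cbm_len  -- number of bits in the capacity bitmask
    shift    -- position of the lowest set bit, to place the allocation
    """
    if not 0 < size_kb <= cache_kb:
        raise ValueError("size must be positive and no larger than the cache")
    # Round up to a whole number of bits using integer ceiling division.
    nbits = -(-size_kb * cbm_len // cache_kb)
    if shift < 0 or shift + nbits > cbm_len:
        raise ValueError("allocation does not fit in the bitmask")
    return ((1 << nbits) - 1) << shift


# Assumed geometry: 20480 KB cache, 20-bit mask, so 1024 KB per bit.
print(hex(size_to_cbm(4096, 20480, 20)))           # 4 bits at the bottom
print(hex(size_to_cbm(4096, 20480, 20, shift=4)))  # same size, next 4 bits
```

Two allocations placed with different shift values do not overlap, while reusing the same shift gives overlapping masks.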

> What I meant by bolt-on was
> that this is a shortcut way of introducing this feature without
> actually worrying about how this will be used by applications and
> that's not a good thing.  We need to be worrying about that.
>
>> - But still does the job. root user can allocate exclusive or overlapping
>> cache lines to threads or group of threads.
>> - No major roadblocks for usage as we can make the allocations like
>> mentioned above and still keep the hierarchy etc and use it when needed.
>> - An important factor is that it can co-exist with other interfaces like #2
>> and #3 for the same easily. So I do not see a reason why we should not use
>> this.
>> This is not meant to be a programmable interface, however it does not
>> prevent co-existence.
>
> I'm not saying they are mutually exclusive but that we're going
> overboard in this direction when programmable interface should be the
> priority.  While this mostly happened naturally for other resources
> because cgroups was introduced later but I think there's a general
> rule to follow there.

Right, cache allocation cannot be treated like memory, as explained in 1.3 and
1.4 here:
http://marc.info/?l=linux-kernel&m=143889397419199


>
>> - If root user has to set affinity of threads that he is allocating cache,
>> he can do so using other cgroups like cpuset or set the masks separately
>> using taskset. This would let him configure the cache allocation on a
>> socket.
>
> Well, root can do whatever it wants with programmable interface too.
> The way things are designed, even containment isn't an issue, assign
> an ID to all processes by default and change the allocation on that.
>
>> this is a pretty bad interface by itself.
>>>
>>>> There is already a lot of such usage among different enterprise users at
>>>> Intel/google/cisco etc who have been testing the patches posted to lkml and
>>>> academically there is plenty of usage as well.
>>>
>>> I mean, that's the tool you gave them.  Of course they'd be using it
>>> but I suspect most of them would do fine with a programmable interface
>>> too.  Again, please think of cpu affinity.
>>
>> All the methodology to support the feature may need an arbitrator/agent to
>> decide the allocation.
>>
>> 1. Let the root user or system administrator be the one who decides the
>> allocation based on the current usage. We assume this to be one with
>> administrative privileges. He could use the cgroup interface to perform the
>> task. One way to do the cpu affinity is by mounting cpuset and rdt cgroup
>> together.
>
> If you factor in threads of a process, the above model is
> fundamentally flawed.  How would root or any external entity find out
> what threads are to be allocated what?

The process ID can be added to the cgroup together with all its threads, as shown
in the example of cgroup usage in (2) at the link below.

In most cases in the cloud you will be able to decide based on what workloads
are running; see example 1.5 here:

http://marc.info/?l=linux-kernel&m=143889397419199
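
To illustrate adding a process together with all its threads, here is a minimal sketch in Python. It relies on the cgroup behaviour that writing a process ID to a group's cgroup.procs file moves every thread of that process into the group. The helper name and the mount path in the usage comment are hypothetical; the actual rdt cgroup path depends on how the hierarchy is mounted.

```python
import os


def add_process_to_cgroup(cgroup_dir, pid):
    """Move a whole process (all of its threads) into a cgroup.

    cgroup_dir -- directory of the target cgroup (path is an assumption)
    pid        -- process ID (thread group ID) to move
    """
    procs_path = os.path.join(cgroup_dir, "cgroup.procs")
    # Writing the PID to cgroup.procs moves the entire thread group,
    # unlike the per-thread "tasks" file.
    with open(procs_path, "w") as f:
        f.write(str(pid))
    return procs_path


# Hypothetical usage, assuming an rdt hierarchy mounted at this path:
# add_process_to_cgroup("/sys/fs/cgroup/rdt/group1", 1234)
```

The administrator only needs the process ID; there is no need to enumerate the individual threads.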


> Each application would
> constantly have to tell an external agent about what its intentions
> are.  This might seem to work in a limited feature testing setup where
> you know everything about who's doing what but is no way a widely
> deployable solution.  This pretty much degenerates into #3 you listed
> below.

The app may not be the best one to decide; see 1.1 and 1.2 here:
http://marc.info/?l=linux-kernel&m=143889397419199

>
>> 2. Kernel automatically assigning the cache based on the priority of the apps
>> etc. This is something which could be designed to co-exist with the #1 above
>> much like how the cpusets cgroup co-exist with the kernel assigning cpus to
>> tasks. (the task could be having a cache capacity mask just like the cpu
>> affinity mask)
>
> I don't think CAT would be applicable in this manner.  BE allocation
> is what the CPU is doing by default already.  I'm highly doubtful
> something like CAT would be used automatically in generic systems.  It
> requires fairly specific coordination after all.

The 3 items were generalized at a high level to show system management versus
the user doing it. I am not saying it should be done this way.

Thanks,
Vikas

>
>> 3. User programmable interface , where say a resource management program
>> x (and hence apps) could link a library which supports cache alloc/monitoring
>> etc and then try to control and monitor the resources. The arbitrator could just
>> be the resource management interface itself or the kernel could decide.
>>
>> If users use this programmable interface, we need to make sure all the apps
>> just cannot allocate resources without some interfacing agent (in which case
>> they could interface with #2 ?).
>>
>> Do you think there are any issues for the user programmable interface to
>> co-exist with the cgroup interface ?
>
> Isn't that a weird question to ask when there's no reason to rush to a
> full-on cgroup controller?

> We can start with something simpler and
> more specific and easier for applications to program against.  If the
> hardware details make it difficult to design properly abstracted
> interface around, make it a char device node, for example, and let
> userland worry about how to control access to it.  If you stick to
> something like that, exposing most of hardware details verbatim is
> fine.  People know they're dealing with something very specific with
> those types of interfaces.
>
> Thanks.
>
> -- 
> tejun
>