linux-kernel - Re: [PATCH 7/7] x86/intel_rdt: Add CAT documentation and usage guide

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.10.1504011116322.3891@vshiva-Udesk>
Date:	Wed, 1 Apr 2015 11:20:46 -0700 (PDT)
From:	Vikas Shivappa <vikas.shivappa@...el.com>
To:	Marcelo Tosatti <mtosatti@...hat.com>
cc:	Vikas Shivappa <vikas.shivappa@...el.com>,
	Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
	x86@...nel.org, linux-kernel@...r.kernel.org, hpa@...or.com,
	tglx@...utronix.de, mingo@...nel.org, tj@...nel.org,
	peterz@...radead.org, matt.fleming@...el.com, will.auld@...el.com,
	glenn.p.williamson@...el.com, kanaka.d.juvva@...el.com
Subject: Re: [PATCH 7/7] x86/intel_rdt: Add CAT documentation and usage
 guide



On Tue, 31 Mar 2015, Marcelo Tosatti wrote:

> On Tue, Mar 31, 2015 at 10:27:32AM -0700, Vikas Shivappa wrote:
>>
>>
>> On Thu, 26 Mar 2015, Marcelo Tosatti wrote:
>>
>>>
>>> I can't find any discussion relating to exposing the CBM interface
>>> directly to userspace in that thread ?
>>>
>>> Cpu.shares is written in ratio form, which is much more natural.
>>> Do you see any advantage in maintaining the
>>>
>>> (ratio -> cbm bitmasks)
>>>
>>> translation in userspace rather than in the kernel ?
>>>
>>> What about something like:
>>>
>>>
>>> 		      root cgroup
>>> 		   /		  \
>>> 		  /		    \
>>> 		/		      \
>>> 	cgroupA-80			cgroupB-30
>>>
>>>
>>> So that whatever exceeds 100% is the ratio of cache
>>> shared at that level (cgroup A and B share 10% of cache
>>> at that level).
>>
>> But this also means the 2 groups share all of the cache ?
>>
>> Specifying the amount of bits to be shared lets you specify the
>> exact cache area where you want to share and also when your total
>> occupancy does not cover all of the cache. For ex: it gets more
>> complex when you want to share say only the left quarter of the
>> cache. cgroupA gets left half and cgroup gets left quarter. The
>> bitmask aligns with how the h/w is designed to share the cache which
>> gives you flexibility to define any specific overlapping areas of
>> the cache.
>
>>> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu_and_memory-use_case.html
>>>
>>> cpu — the cpu.shares parameter determines the share of CPU resources
>>> available to each process in all cgroups. Setting the parameter to 250,
>>> 250, and 500 in the finance, sales, and engineering cgroups respectively
>>> means that processes started in these groups will split the resources
>>> with a 1:1:2 ratio. Note that when a single process is running, it
>>> consumes as much CPU as necessary no matter which cgroup it is placed
>>> in. The CPU limitation only comes into effect when two or more processes
>>> compete for CPU resources.
>>>
>>>
>>
>> These are more defined in terms of how many cache lines (or how many
>> cache ways) they can use and would be difficult to define them in
>> terms of percentage. In contrast the cpu share is a time shared
>> thing and is much more granular where as here its not , its
>> occupancy in terms of cache lines/ways.. (however this is not really
>> defined as a restriction but thats the way it is now).
>> Also note that the granularity of the bitmasks define the
>> granularity of the percentages and in some SKUs the granularity is
>> 2b and not 1b.. So technically you wont be able to even allocate
>> percentage of cache even in 10% granularity for most of the cases
>> (if there are 30MB and 25 ways like in one of hsw SKU) and this will
>> vary for different SKUs which makes it more complicated for users.
>> However the user library is free to define own interface based on
>> the underlying cgroup interface say for example you never care about
>> the overlapping and using it for a specific SKU etc.. The underlying
>> cgroup framework is meant to be  generic for all SKus and used for
>> most of the use cases.
>>
>> Also at this point I see a lot of enterprise and and other users
>> already using the cgroup interface or shown interest in the same.
>> However I see your point where you indicate the ease with which user
>> can specify in size/percentage which he might be used to doing for
>> other resources rather than bits where he needs to get an idea size
>> by calculating it seperately - But again note that you may not be
>> able to define percentages in many scenarios like the one above. And
>> another question would be we would need to convince the users to
>> adapt to the modified percentage user model (ex: like the one you
>> say above where percentage - 100 is the one thats shared)
>> I can review this requirements and others I have received and get
>> back to see the closest that can be done if possible.
>>
>> Thanks,
>> Vikas
>
> Vikas,
>
> I see. Don't have anything against performing the translation in userspace
> (i agree userspace should be able to allow ratios and specific
> minimum/maximum counts). Can you please export the relevant information
> in files in /sys or cgroups itself rather than requiring userspace to
> parse CPUID etc? Including the EBX register from CPUID(EAX=10H, ECX=1),
> which is necessary to implement "reserved LLC" properly.
>
> The current interface is unable to handle the cross CPU case, though.
> It would be necessary to expose per-socket masks.
>
>

Marcelo,

The current package supports per-socket updates to masks. Although the CLOSids 
are allocated globally just like in CMT and not per package.

The maximum bitmask is the root node's bitmask which is exposed already. The 
number of CLOSids are not exposed as kernel internally optimizes its usage and 
that should not end up giving a wrong picture for the user. For ex: if the 
number of CLOSids available is say 4 - the kernel could actually allocate them 
to more cgroups than just 4 cgroups , and this logic may change based on other 
features that my be added in the cgroup or depending on features available in 
the SKUs .. However with CAT cgroups an error is 
returned once kernel runs out of CLOSids. I am still reviewing this requirement 
with respect to the closids and will send an update soon.

Thanks,
Vikas