linux-kernel - Re: [RFC PATCH 00/20] Intel(R) Resource Director Technology Cache Pseudo-Locking enabling


Message-ID: <93415e33-6adf-047f-9a46-0862c3cd33b6@intel.com>
Date:   Fri, 17 Nov 2017 22:42:13 -0800
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     fenghua.yu@...el.com, tony.luck@...el.com,
        vikas.shivappa@...ux.intel.com, dave.hansen@...el.com,
        mingo@...hat.com, hpa@...or.com, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 00/20] Intel(R) Resource Director Technology Cache
 Pseudo-Locking enabling

Hi Thomas,

On 11/17/2017 4:48 PM, Thomas Gleixner wrote:
> On Mon, 13 Nov 2017, Reinette Chatre wrote:
> 
> thanks for that interesting work. Before I start looking into the details
> in the next days let me ask a few general questions first.

Thank you very much for taking a look. I look forward to your feedback.

> 
>> Cache Allocation Technology (CAT), part of Intel(R) Resource Director
>> Technology (Intel(R) RDT), enables a user to specify the amount of cache
>> space into which an application can fill. Cache pseudo-locking builds on
>> the fact that a CPU can still read and write data pre-allocated outside
>> its current allocated area on cache hit. With cache pseudo-locking data
>> can be preloaded into a reserved portion of cache that no application can
>> fill, and from that point on will only serve cache hits. The cache
>> pseudo-locked memory is made accessible to user space where an application
>> can map it into its virtual address space and thus have a region of
>> memory with reduced average read latency.
> 
> Did you compare that against the good old cache coloring mechanism,
> e.g. palloc ?

I understand where your question originates. I have not compared against PALLOC, for two reasons:
1) PALLOC is not upstream. When we inquired about the status of this work (please see https://github.com/heechul/palloc/issues/4 for details), we learned that one reason for this is that recent Intel processors are not well supported.
2) The most recent kernel supported by PALLOC is v4.4 and, as also mentioned in the above link, there is currently no plan to upstream this work. Upstreaming would have allowed a less divergent comparison of PALLOC and the more recent RDT/CAT enabling on which Cache Pseudo-Locking is built.

>> The cache pseudo-locking approach relies on generation-specific behavior
>> of processors. It may provide benefits on certain processor generations,
>> but is not guaranteed to be supported in the future.
> 
> Hmm, are you saying that the CAT mechanism might change radically in the
> future so that access to cached data in an allocated area which does not
> belong to the current executing context won't work anymore?

Most devices that publicly support CAT in the Linux mainline can take advantage of Cache Pseudo-Locking. However, Cache Pseudo-Locking is a model-specific feature, so there may be some variation in whether, or to what extent, current and future devices can support it. CAT remains architectural.
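
As an illustration of the "reserved portion of cache that no application can fill" idea, here is a minimal sketch in Python. It is not code from this patch series. It assumes that CAT allocations are expressed as capacity bitmasks (CBMs) whose set bits must be contiguous, and it checks that a reserved mask does not overlap any mask that other groups may fill. The 20-bit mask width, the group names, and the function names are all hypothetical and chosen only for the example.

```python
from typing import Mapping


def is_contiguous(cbm: int) -> bool:
    """Return True if the set bits of cbm form a single unbroken run."""
    if cbm <= 0:
        return False
    lowest = cbm & -cbm  # isolate the lowest set bit
    # Adding the lowest set bit to an unbroken run carries through the
    # whole run, so the sum shares no bit with the original mask.
    return (cbm + lowest) & cbm == 0


def reserved_is_exclusive(reserved: int, others: Mapping[str, int]) -> bool:
    """Return True if no other group's mask overlaps the reserved mask."""
    return all(reserved & cbm == 0 for cbm in others.values())


# Hypothetical layout: a 20-bit mask where the two lowest bits are set
# aside for the pseudo-locked region and all other groups fill elsewhere.
RESERVED = 0x00003
OTHERS = {"default": 0xFFFFC, "latency_group": 0x00FF0}

if __name__ == "__main__":
    for name, cbm in OTHERS.items():
        print(name, hex(cbm), "contiguous:", is_contiguous(cbm))
    print("reserved region exclusive:", reserved_is_exclusive(RESERVED, OTHERS))
```

The sketch only models the bookkeeping. Whether data preloaded into the reserved portion actually stays there depends on the model-specific behavior described above.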

>> It is not a guarantee that data will remain in the cache. It is not a
>> guarantee that data will remain in certain levels or certain regions of
>> the cache. Rather, cache pseudo-locking increases the probability that
>> data will remain in a certain level of the cache via carefully
>> configuring the CAT feature and carefully controlling application
>> behavior.
> 
> Which kind of applications are you targeting with that?
>
> Are there real world use cases which actually can benefit from this and

To ensure I answer your question I will consider two views. First, the "carefully controlling application behavior" mentioned above refers to applications/OS/VMs running after the pseudo-locked regions have been set up. These should take care not to do anything, for example calling wbinvd, that would affect the Cache Pseudo-Locked regions. Second, you are also asking about the applications that use these Cache Pseudo-Locked regions. We do see a clear performance benefit to applications using these pseudo-locked regions. Latency-sensitive applications could relocate their code as well as data to pseudo-locked regions for improved performance.

> what are those applications supposed to do once the feature breaks with
> future generations of processors?

This feature is model-specific, with a few platforms supporting it at this time. Only platforms known to support Cache Pseudo-Locking will expose its resctrl interface.

Reinette