Date:   Thu, 2 Aug 2018 09:14:10 -0700
From:   Reinette Chatre <reinette.chatre@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     tglx@...utronix.de, mingo@...hat.com, fenghua.yu@...el.com,
        tony.luck@...el.com, vikas.shivappa@...ux.intel.com,
        gavin.hindman@...el.com, jithu.joseph@...el.com,
        dave.hansen@...el.com, hpa@...or.com, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] x86/intel_rdt and perf/x86: Fix lack of coordination
 with perf

Hi Peter,

On 8/2/2018 5:39 AM, Peter Zijlstra wrote:
> On Tue, Jul 31, 2018 at 12:38:27PM -0700, Reinette Chatre wrote:
>> Dear Maintainers,
>>
>> The success of Cache Pseudo-Locking can be measured via the use of
>> performance events. Specifically, the number of cache hits and misses
>> reading a memory region after it has been pseudo-locked to cache. This
>> measurement is triggered via the resctrl debugfs interface.
>>
>> To ensure most accurate results the performance counters and their
>> configuration registers are accessed directly.
> 
> NAK on that.
> 

After data is locked to cache we need to measure whether the locking
succeeded. There is no instruction that we can use to query whether a
memory address has been cached, but we can use performance monitoring
events, which are especially valuable on platforms where they are
precise-event capable.

To ensure that we are only measuring the presence of data that should be
locked to cache we need to tightly control how this measurement is done.

For example, on my test system I locked 256KB to the cache and with the
current implementation (tip.git on branch x86/cache) I am able to
accurately measure that this was successful as seen below (each cache
line within the 256KB is accessed while the performance monitoring
events are active):

pseudo_lock_mea-26090 [002] .... 61838.488027: pseudo_lock_l2: hits=4096 miss=0
pseudo_lock_mea-26097 [002] .... 61843.689381: pseudo_lock_l2: hits=4096 miss=0
pseudo_lock_mea-26100 [002] .... 61848.751411: pseudo_lock_l2: hits=4096 miss=0
pseudo_lock_mea-26108 [002] .... 61853.820361: pseudo_lock_l2: hits=4096 miss=0
pseudo_lock_mea-26111 [002] .... 61858.880364: pseudo_lock_l2: hits=4096 miss=0
pseudo_lock_mea-26118 [002] .... 61863.937343: pseudo_lock_l2: hits=4096 miss=0
pseudo_lock_mea-26121 [002] .... 61869.008341: pseudo_lock_l2: hits=4096 miss=0
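
As a rough cross-check of these numbers: assuming a 64-byte cache line
(an assumption, the line size is not stated in this mail), 256KB spans
256 * 1024 / 64 = 4096 cache lines, which matches hits=4096 with miss=0.
A minimal sketch of that check follows. The helper names are hypothetical
and are not part of resctrl or perf; only the pseudo_lock_l2 line format
is taken from the output above.

```python
import re

# Assumption: 64-byte cache lines; the mail does not state the line size.
CACHE_LINE_BYTES = 64

# Matches the pseudo_lock_l2 trace format shown above. \s+ also tolerates
# a line wrap between the hits and miss fields.
TRACE_RE = re.compile(r"pseudo_lock_l2: hits=(\d+)\s+miss=(\d+)")


def expected_lines(region_bytes, line_bytes=CACHE_LINE_BYTES):
    """Number of cache lines covered by a region of region_bytes."""
    return region_bytes // line_bytes


def parse_trace(text):
    """Return a list of (hits, misses) tuples found in trace output."""
    return [(int(h), int(m)) for h, m in TRACE_RE.findall(text)]


def fully_locked(samples, region_bytes):
    """True if every sample hit on each cache line and never missed."""
    want = expected_lines(region_bytes)
    return bool(samples) and all(h == want and m == 0 for h, m in samples)


if __name__ == "__main__":
    sample = (
        "pseudo_lock_mea-26090 [002] .... 61838.488027: "
        "pseudo_lock_l2: hits=4096 miss=0\n"
    )
    print(expected_lines(256 * 1024))
    print(fully_locked(parse_trace(sample), 256 * 1024))
```

A run that reports fewer hits than expected_lines(), or any misses, would
suggest that part of the region was not resident in L2 when it was read.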

The current implementation does not coordinate with perf and this is
what I am trying to fix in this series.

I do respect your NAK but it is not clear to me how to proceed after
obtaining it. Could you please elaborate on what you would prefer as a
solution to ensure accurate measurement of cache-locked data that is
better integrated?

Thank you very much

Reinette

