Date:   Tue, 7 Feb 2017 10:52:04 -0800
From:   "Luck, Tony" <tony.luck@...el.com>
To:     Stephane Eranian <eranian@...gle.com>
Cc:     David Carrillo-Cisneros <davidcc@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
        "Shivappa, Vikas" <vikas.shivappa@...el.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        x86 <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        "Shankar, Ravi V" <ravi.v.shankar@...el.com>,
        "Yu, Fenghua" <fenghua.yu@...el.com>,
        "Kleen, Andi" <andi.kleen@...el.com>,
        "Anvin, H Peter" <h.peter.anvin@...el.com>
Subject: Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

On Tue, Feb 07, 2017 at 12:08:09AM -0800, Stephane Eranian wrote:
> Hi,
> 
> I wanted to take a few steps back and look at the overall goals for
> cache monitoring.
> From the various threads and discussion, my understanding is as follows.
> 
> I think the design must ensure that the following usage models can be monitored:
>    - the allocations in your CAT partitions
>    - the allocations from a task (inclusive of children tasks)
>    - the allocations from a group of tasks (inclusive of children tasks)
>    - the allocations from a CPU
>    - the allocations from a group of CPUs
> 
> All cases but the first one (CAT) are natural usage. So I want to describe
> the CAT case in more detail.
> The goal, as I understand it, is to monitor what is going on inside
> the CAT partition to detect
> whether it saturates or if it has room to "breathe". Let's take a
> simple example.

By "natural usage" you mean "like perf(1) provides for other events"?

But we are trying to figure out requirements here ... what data do people
need to manage caches and memory bandwidth.  So from this perspective
monitoring a CAT group is a natural first choice ... did we provision
this group with too much, or too little, cache?

From that starting point I can see that a possible next step when
finding that a CAT group has too small a cache is to drill down to
find out how the tasks in the group are using cache.  Armed with that
information you could move tasks that hog too much cache (and are believed
to be streaming through memory) into a different CAT group.
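
As a rough sketch of that drill-down step (the task names, occupancy
numbers, the 50% threshold and the find_cache_hogs() helper are all made
up for illustration; this is not an existing kernel or perf interface),
picking out the tasks that dominate a group's occupancy could look like:

```python
# Hypothetical per-task cache occupancy (bytes) sampled for one CAT group.
# Every name and number here is invented for illustration.

def find_cache_hogs(occupancy_by_task, share_threshold=0.5):
    """Return (task, share) pairs whose share of the group's total
    occupancy exceeds share_threshold, largest share first."""
    total = sum(occupancy_by_task.values())
    if total == 0:
        return []
    hogs = [
        (task, occ / total)
        for task, occ in occupancy_by_task.items()
        if occ / total > share_threshold
    ]
    return sorted(hogs, key=lambda pair: pair[1], reverse=True)


MIB = 1024 * 1024
group = {"task_a": 6 * MIB, "task_b": 1 * MIB, "task_c": 1 * MIB}

# Tasks returned here are candidates to move into a different CAT group.
hogs = find_cache_hogs(group)
for task, share in hogs:
    print(f"{task}: {share:.0%} of group occupancy")
```

Whether a task found this way really is streaming through memory still
needs to be judged separately; occupancy share alone does not tell you.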

What I'm not seeing is how drilling to CPUs helps you.

Say you have CPUs=CPU0,CPU1 in the CAT group and you collect data that
shows that 75% of the cache occupancy is attributed to CPU0, and only
25% to CPU1.  What can you do with this information to improve things?
If it is deemed too complex (from a kernel code perspective) to
implement per-CPU reporting, how bad a loss would that be?

-Tony
