linux-kernel - Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes


Message-ID: <CALcN6miQVL1sEw_DG7XT=4PQBXYKW7TKEX15Qd9EQc0Pr5Qigw@mail.gmail.com>
Date:   Fri, 3 Feb 2017 13:08:05 -0800
From:   David Carrillo-Cisneros <davidcc@...gle.com>
To:     "Luck, Tony" <tony.luck@...el.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
        "Shivappa, Vikas" <vikas.shivappa@...el.com>,
        Stephane Eranian <eranian@...gle.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        x86 <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        "Shankar, Ravi V" <ravi.v.shankar@...el.com>,
        "Yu, Fenghua" <fenghua.yu@...el.com>,
        "Kleen, Andi" <andi.kleen@...el.com>,
        "Anvin, H Peter" <h.peter.anvin@...el.com>
Subject: Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes

On Fri, Feb 3, 2017 at 9:52 AM, Luck, Tony <tony.luck@...el.com> wrote:
> On Thu, Feb 02, 2017 at 06:14:05PM -0800, David Carrillo-Cisneros wrote:
>> If we tie allocation groups and monitoring groups, we are tying the
>> meaning of CPUs and we'll have to choose between the CAT meaning or
>> the perf meaning.
>>
>> Let's allow semantics that will allow perf-like monitoring to
>> eventually work, even if it's not immediately supported.
>
> Would it work to make monitor groups be "task list only" or "cpu mask only"
> (unlike control groups that allow mixing)?

That works, but please don't use chmod. Make it explicit by the group
position (i.e. mon/cpus/grpCPU1, mon/tasks/grpTasks1).
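To illustrate what "explicit by position" could mean, here is a minimal
sketch in Python. It assumes the mon/cpus and mon/tasks layout from the
example paths above; the function name and the returned labels are
hypothetical, not an existing interface.

```python
from pathlib import PurePosixPath

# Hypothetical mapping: the parent directory alone decides the group type.
KIND_BY_PARENT = {"cpus": "cpu-mask-only", "tasks": "task-list-only"}


def monitor_group_kind(path: str) -> str:
    """Return the group type implied by where the group lives.

    Expects paths of the form mon/<cpus|tasks>/<group name>.
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 3 or parts[0] != "mon" or parts[1] not in KIND_BY_PARENT:
        raise ValueError(f"not a monitoring group path: {path!r}")
    return KIND_BY_PARENT[parts[1]]
```

With this, a group's type never depends on file modes or on what other
groups exist, only on the directory it was created in.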

>
> Then the intel_rdt_sched_in() code could pick the RMID in ways that
> give you the perf(1) meaning. I.e. if you create a monitor group and assign
> some CPUs to it, then we will always load the RMID for that monitor group
> when running on those cpus, regardless of what group(s) the current process
> belongs to.  But if you didn't create any cpu-only monitor groups, then we'd
> assign RMID using same rules as CLOSID (so measurements from a control group
> would track allocation policies).

I think that's very confusing for the user. A group's observed
behavior should be determined by its own attributes and should not
change depending on how other groups are configured. Think of multiple
users monitoring simultaneously.

>
> We are already planning that creating monitor only groups will change
> what is reported in the control group (e.g. you pull some tasks out of
> the control group to monitor them separately, so the control group only
> reports the tasks that you didn't move out for monitoring).

That's also confusing, and the work-around that Vikas proposed of two
separate files to enumerate tasks (one for control and one for
monitoring) breaks the concept of a task group.

From our discussions, we can support the use cases we care about
without weird corner cases, by having:
  - A set of allocation groups as they stand now. Either use the
current resctrl, or rename it to something like resdir/ctrl (before
v4.10 sails).
  - A set of monitoring task groups. Either in a "tasks" folder in a
resmon fs or in resdir/mon/tasks.
  - A set of monitoring CPU groups. Either in a "cpus" folder in a
resmon fs or in resdir/mon/cpus.

So when a user measures a group (shown using the -G option, it could
as well be the -R Vikas wants):

1) perf stat -e llc_occupancy -G resdir/ctrl/g1
measures the CAT allocation group as if RMIDs were managed like CLOSIDs.

2) perf stat -e llc_occupancy -G resdir/mon/tasks/g1
measures the combined occupancy of all tasks in g1 (like a cgroup in
present-day perf).

3) perf stat -e llc_occupancy -C <some id of resdir/mon/cpus/g1>
*XOR* perf stat -e llc_occupancy -G resdir/mon/cpus/g1
measures the combined occupancy of all tasks while they run on any CPU
in g1 (perf-like filtering, not the CAT way).
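The difference between the three cases can be sketched as a rule for
deciding whether a task running on a CPU counts toward a group. This is
only a Python model with hypothetical names. The rule used for case 1
(a task's own group wins, otherwise the CPU's group applies) is my
assumption of what "managed like CLOSIDs" means, not a statement about
the actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    tasks: frozenset = frozenset()
    cpus: frozenset = frozenset()


def ctrl_group_for(ctrl_groups, task, cpu):
    """Case 1 (assumed CLOSID-like rule): task membership takes
    precedence, then CPU membership, then the default group."""
    for name, group in ctrl_groups.items():
        if task in group.tasks:
            return name
    for name, group in ctrl_groups.items():
        if cpu in group.cpus:
            return name
    return "default"


def counted_in_task_group(group, task, cpu):
    """Case 2: only the task list matters; the CPU is ignored."""
    return task in group.tasks


def counted_in_cpu_group(group, task, cpu):
    """Case 3: only the CPU mask matters; the task is ignored."""
    return cpu in group.cpus
```

Note that in cases 2 and 3 the answer depends only on the group being
asked about, never on how other groups are configured.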

I know the present implementation scope is limited, so you could:
  - support 1) and/or 2) only
  - do a simple RMID management (e.g. same RMID on all packages,
allocate RMID on creation or fail)
  - do the custom fs based tool that Vikas mentioned instead of using
perf_event_open (if it's somehow easier to build and maintain a new
tool rather than reuse perf(1) ).
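As an illustration of how small the "simple RMID management" option can
be, here is a sketch in Python. The class name, method names, and the
pool size passed in are hypothetical. It models one RMID per group,
shared across all packages, handed out at group creation, with creation
failing when none are left.

```python
class RmidPool:
    """One RMID per group, the same on every package (assumed model)."""

    def __init__(self, num_rmids: int):
        # Stored in reverse so that pop() hands out the lowest free RMID.
        self._free = list(range(num_rmids - 1, -1, -1))
        self._by_group = {}

    def create_group(self, name: str) -> int:
        """Allocate an RMID at creation time, or fail immediately."""
        if name in self._by_group:
            raise ValueError(f"group {name!r} already exists")
        if not self._free:
            raise RuntimeError("no free RMID, cannot create group")
        rmid = self._free.pop()
        self._by_group[name] = rmid
        return rmid

    def remove_group(self, name: str) -> None:
        """Return the group's RMID to the pool."""
        self._free.append(self._by_group.pop(name))
```

Failing at creation keeps the semantics of an existing group stable: a
group that exists always has an RMID and is always being measured.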

Any or all of the above are fine. But please don't choose group
semantics that will prevent us from eventually supporting full
perf-like behavior, or that we already know will explode in users'
faces.

Thanks,
David