linux-kernel - Re: [PATCH 01/14] x86/cqm: Intel Resource Monitoring Documentation


Message-ID: <alpine.DEB.2.10.1612231126590.32409@vshiva-Udesk>
Date:   Fri, 23 Dec 2016 11:35:03 -0800 (PST)
From:   Shivappa Vikas <vikas.shivappa@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>
cc:     Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
        vikas.shivappa@...el.com, linux-kernel@...r.kernel.org,
        x86@...nel.org, tglx@...utronix.de, ravi.v.shankar@...el.com,
        tony.luck@...el.com, fenghua.yu@...el.com, andi.kleen@...el.com,
        davidcc@...gle.com, eranian@...gle.com, hpa@...or.com
Subject: Re: [PATCH 01/14] x86/cqm: Intel Resource Monitoring Documentation


Hello Peterz,

On Fri, 23 Dec 2016, Peter Zijlstra wrote:

> On Fri, Dec 16, 2016 at 03:12:55PM -0800, Vikas Shivappa wrote:
>> +Continuous monitoring
>> +---------------------
>> +A new file cont_monitoring is added to perf_cgroup which helps to enable
>> +cqm continuous monitoring. Enabling this field would start monitoring of
>> +the cgroup without perf being launched. This can be used for long term
>> +light weight monitoring of tasks/cgroups.
>> +
>> +To enable continuous monitoring of cgroup p1.
>> +#echo 1 > /sys/fs/cgroup/perf_event/p1/perf_event.cqm_cont_monitoring
>> +
>> +To disable continuous monitoring of cgroup p1.
>> +#echo 0 > /sys/fs/cgroup/perf_event/p1/perf_event.cqm_cont_monitoring
>> +
>> +To read the counters at the end of monitoring perf can be used.
>> +
>> +LAZY and NOLAZY Monitoring
>> +--------------------------
>> +LAZY:
>> +By default when monitoring is enabled, the RMIDs are not allocated
>> +immediately and allocated lazily only at the first sched_in.
>> +There are 2-4 RMIDs per logical processor on each package. So if a dual
>> +package has 48 logical processors, there would be upto 192 RMIDs on each
>> +package = total of 192x2 RMIDs.
>> +There is a possibility that RMIDs can runout and in that case the read
>> +reports an error since there was no RMID available to monitor for an
>> +event.
>> +
>> +NOLAZY:
>> +When user wants guaranteed monitoring, he can enable the 'monitoring
>> +mask' which is basically used to specify the packages he wants to
>> +monitor. The RMIDs are statically allocated at open and failure is
>> +indicated if RMIDs are not available.
>> +
>> +To specify monitoring on package 0 and package 1:
>> +#echo 0-1 > /sys/fs/cgroup/perf_event/p1/perf_event.cqm_mon_mask
>> +
>> +An error is thrown if packages not online are specified.
>
> I very much dislike both those for adding files to the perf cgroup.
> Drivers should really not do that.

Is the issue the continuous monitoring itself, or the interface (adding a file
to perf_cgroup)? I did not mention it in the documentation, but in this patch
the continuous monitoring and the monitoring mask apply only to cgroups. Since
both are per cgroup, we thought the cgroup itself was a good place for them.

For task events this won't apply, and we are thinking of providing a
prctl-based interface for the user to toggle continuous monitoring.

>
> I absolutely hate the second because events already have affinity.

This applies to continuous monitoring as well, when there are no events
associated. That is, if the monitoring mask is set and the user then enables
continuous monitoring through cgrp->cont_mon, all RMIDs are allocated
immediately. The mon_mask gives the user a way to have guaranteed RMIDs both
when events are associated and for continuous monitoring (no perf event
associated). This assumes the user sets it only when he knows he will
definitely use the RMIDs; otherwise there is LAZY mode.

Again, this is cgroup specific and won't apply to task events; it is needed
when there are no events associated.
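
To make the ordering concrete, here is a rough sketch of the sequence I mean:
set the mask first, then turn on continuous monitoring. It uses only the two
files proposed in this patch (perf_event.cqm_mon_mask and
perf_event.cqm_cont_monitoring). CGROUP_ROOT and the helper name
cqm_guaranteed_cont_mon are just illustrative and not part of the patch.

```shell
#!/bin/sh
# Sketch only: the perf_event.cqm_* files are the ones proposed in this
# patch series. CGROUP_ROOT and cqm_guaranteed_cont_mon are hypothetical
# names used for illustration.
CGROUP_ROOT="${CGROUP_ROOT:-/sys/fs/cgroup/perf_event}"

cqm_guaranteed_cont_mon() {
    cgrp="$1"    # cgroup name, e.g. p1
    pkgs="$2"    # package list, e.g. 0-1

    # NOLAZY: pick the packages first so RMIDs are allocated up front.
    echo "$pkgs" > "$CGROUP_ROOT/$cgrp/perf_event.cqm_mon_mask" || return 1

    # Then start monitoring the cgroup with no perf event attached.
    echo 1 > "$CGROUP_ROOT/$cgrp/perf_event.cqm_cont_monitoring" || return 1
}
```

With the proposed interface, "cqm_guaranteed_cont_mon p1 0-1" would be
equivalent to the two echo commands from the documentation, run in that order.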

Thanks,
Vikas

>
> I can't see this happening.
>