linux-kernel - Re: [PATCH 01/14] x86/cqm: Intel Resource Monitoring Documentation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20161223203318.GU3107@twins.programming.kicks-ass.net>
Date:   Fri, 23 Dec 2016 21:33:18 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Shivappa Vikas <vikas.shivappa@...el.com>
Cc:     Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
        linux-kernel@...r.kernel.org, x86@...nel.org, tglx@...utronix.de,
        ravi.v.shankar@...el.com, tony.luck@...el.com,
        fenghua.yu@...el.com, andi.kleen@...el.com, davidcc@...gle.com,
        eranian@...gle.com, hpa@...or.com
Subject: Re: [PATCH 01/14] x86/cqm: Intel Resource Monitoring Documentation

On Fri, Dec 23, 2016 at 11:35:03AM -0800, Shivappa Vikas wrote:
> 
> Hello Peterz,
> 
> On Fri, 23 Dec 2016, Peter Zijlstra wrote:
> 
> >On Fri, Dec 16, 2016 at 03:12:55PM -0800, Vikas Shivappa wrote:
> >>+Continuous monitoring
> >>+---------------------
> >>+A new file cont_monitoring is added to perf_cgroup which helps to enable
> >>+cqm continuous monitoring. Enabling this field would start monitoring of
> >>+the cgroup without perf being launched. This can be used for long term
> >>+light weight monitoring of tasks/cgroups.
> >>+
> >>+To enable continuous monitoring of cgroup p1.
> >>+#echo 1 > /sys/fs/cgroup/perf_event/p1/perf_event.cqm_cont_monitoring
> >>+
> >>+To disable continuous monitoring of cgroup p1.
> >>+#echo 0 > /sys/fs/cgroup/perf_event/p1/perf_event.cqm_cont_monitoring
> >>+
> >>+To read the counters at the end of monitoring perf can be used.
> >>+
> >>+LAZY and NOLAZY Monitoring
> >>+--------------------------
> >>+LAZY:
> >>+By default when monitoring is enabled, the RMIDs are not allocated
> >>+immediately and allocated lazily only at the first sched_in.
> >>+There are 2-4 RMIDs per logical processor on each package. So if a dual
> >>+package has 48 logical processors, there would be upto 192 RMIDs on each
> >>+package = total of 192x2 RMIDs.
> >>+There is a possibility that RMIDs can runout and in that case the read
> >>+reports an error since there was no RMID available to monitor for an
> >>+event.
> >>+
> >>+NOLAZY:
> >>+When user wants guaranteed monitoring, he can enable the 'monitoring
> >>+mask' which is basically used to specify the packages he wants to
> >>+monitor. The RMIDs are statically allocated at open and failure is
> >>+indicated if RMIDs are not available.
> >>+
> >>+To specify monitoring on package 0 and package 1:
> >>+#echo 0-1 > /sys/fs/cgroup/perf_event/p1/perf_event.cqm_mon_mask
> >>+
> >>+An error is thrown if packages not online are specified.
> >
> >I very much dislike both those for adding files to the perf cgroup.
> >Drivers should really not do that.
> 
> Is the continuous monitoring the issue or the interface (adding a file in
> perf_cgroup) ? I have not mentioned in the documentaion but this continuous
> monitoring/ monitoring mask applies only to cgroup in this patch and hence
> we thought a good place for that is in the cgroup itself because its per
> cgroup.
> 
> For task events , this wont apply and we are thinking of providing a prctl
> based interface for user to toggle the continous monitoring ..

More fail..

> >
> >I absolutely hate the second because events already have affinity.
> 
> This applies to continuous monitoring as well when there are no events
> associated. Meaning if the monitoring mask is chosen and user tries to
> enable continuous monitoring using the cgrp->cont_mon - all RMIDs are
> allocated immediately. the mon_mask provides a way for the user to have
> guarenteed RMIDs for both that have events and for continoous monitoring(no
> perf event associated) (assuming user uses it when user knows he would
> definitely use it.. or else there is LAZY mode)
> 
> Again this is cgroup specific and wont apply to task events and is needed
> when there are no events associated.

So no, the problem is that a driver introduces special ABI and behaviour
that radically departs from the regular behaviour.

Also, the 'whoops you ran out of RMIDs, please reboot' thing totally and
completely blows.