Message-ID: <CALcN6mgcyEukFbu-q4OR2o0DTjMrs9-9hPy_PM04LntPjiKr+g@mail.gmail.com>
Date: Mon, 6 Feb 2017 16:33:27 -0800
From: David Carrillo-Cisneros <davidcc@...gle.com>
To: "Luck, Tony" <tony.luck@...el.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
"Shivappa, Vikas" <vikas.shivappa@...el.com>,
Stephane Eranian <eranian@...gle.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
x86 <x86@...nel.org>, "hpa@...or.com" <hpa@...or.com>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
"Shankar, Ravi V" <ravi.v.shankar@...el.com>,
"Yu, Fenghua" <fenghua.yu@...el.com>,
"Kleen, Andi" <andi.kleen@...el.com>,
"Anvin, H Peter" <h.peter.anvin@...el.com>
Subject: Re: [PATCH 00/12] Cqm2: Intel Cache quality monitoring fixes
On Mon, Feb 6, 2017 at 3:27 PM, Luck, Tony <tony.luck@...el.com> wrote:
>> cgroup mode gives a per-CPU breakdown of event and running time, the
>> tool aggregates it into running time vs event count. Both per-cpu
>> breakdown and the aggregate are useful.
>>
>> Piggy-backing on perf's cgroup mode would give us all the above for free.
>
> Do you have some sample output from a perf run on a cgroup measuring a
> "normal" event showing what you get?
# perf stat -I 1000 -e cycles -a -C 0-1 -A -x, -G /
1.000116648,CPU0,20677864,,cycles,/
1.000169948,CPU1,24760887,,cycles,/
2.000453849,CPU0,36120862,,cycles,/
2.000480259,CPU1,12535575,,cycles,/
3.000664762,CPU0,7564504,,cycles,/
3.000692552,CPU1,7307480,,cycles,/
>
> I think that requires that we still go through perf ->start() and ->stop() functions
> to know how much time we spent running. I thought we were looking at bundling
> the RMID updates into the same spot in sched() where we switch the CLOSID.
> More or less at the "start" point, but there is no "stop". If we are switching between
> runnable processes, it amounts to pretty much the same thing ... except we bill
> to someone all the time instead of having a gap in the context switch where we
> stopped billing to the old task and haven't started billing to the new one yet.
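For concreteness, here is roughly what I picture the bundled update
looking like: one IA32_PQR_ASSOC write in the sched-in path carrying
both fields (just a sketch; the MSR layout is bits 0-9 RMID and bits
32-63 CLOSID per the SDM, while the helper and state names below are
made up, not from an actual patch):

#define MSR_IA32_PQR_ASSOC	0x0c8f

static inline void __intel_rdt_sched_in(struct task_struct *next)
{
	struct intel_pqr_state *state = this_cpu_ptr(&pqr_state);
	u32 closid = next->closid;
	u32 rmid = next->rmid;

	/* Skip the (slow) MSR write when neither field changes. */
	if (state->closid != closid || state->rmid != rmid) {
		state->closid = closid;
		state->rmid = rmid;
		wrmsr(MSR_IA32_PQR_ASSOC, rmid, closid);
	}
}

With that, "start" is implicit in every switch-in, as you say, and
there is no explicit "stop".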
Another problem is that it would require a perf event to be active all
the time for the timing measurements to stay consistent with the RMID
measurements.
The only sane option I can come up with is to do the timing in RDT the
way perf cgroup does it (keep a per-cpu time that increases with the
local clock's delta). A reader can then add up the times for all CPUs
in cpu_mask.
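Roughly, the pattern I have in mind mirrors perf cgroup's time
accounting (only a sketch; the struct and function names are invented,
and the per-resource-group dimension is elided here):

struct rdtgrp_time {
	u64	time;		/* accumulated running time */
	u64	timestamp;	/* local clock at last update */
};

static DEFINE_PER_CPU(struct rdtgrp_time, rdtgrp_time);

/* Called from the sched switch path, before the RMID is changed. */
static void rdtgrp_update_time(void)
{
	struct rdtgrp_time *t = this_cpu_ptr(&rdtgrp_time);
	u64 now = local_clock();

	t->time += now - t->timestamp;
	t->timestamp = now;
}

/* Reader side: sum the per-cpu times over the group's cpu_mask. */
static u64 rdtgrp_read_time(const struct cpumask *cpu_mask)
{
	u64 sum = 0;
	int cpu;

	for_each_cpu(cpu, cpu_mask)
		sum += per_cpu(rdtgrp_time, cpu).time;
	return sum;
}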
>
> But if we idle ... then we don't "stop". Shouldn't matter much from a measurement
> perspective because idle won't use cache or consume bandwidth. But we'd count
> that time as "on cpu" for the last process to run.
I may be missing something basic, but isn't __switch_to called when
switching to the idle task? That would update the CLOSID and RMID to
whatever the idle task is in, wouldn't it?
Thanks,
David