Message-ID: <CALcN6miebWcuLYcA19CN88bJ1pT8zRksFm-f3KFvQcemkqJa8Q@mail.gmail.com>
Date: Fri, 10 Mar 2017 17:53:23 -0800
From: David Carrillo-Cisneros <davidcc@...gle.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Stephane Eranian <eranian@...gle.com>,
"Luck, Tony" <tony.luck@...el.com>,
Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
"Shivappa, Vikas" <vikas.shivappa@...el.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"hpa@...or.com" <hpa@...or.com>,
"mingo@...nel.org" <mingo@...nel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"Shankar, Ravi V" <ravi.v.shankar@...el.com>,
"Yu, Fenghua" <fenghua.yu@...el.com>,
"Kleen, Andi" <andi.kleen@...el.com>
Subject: Re: [PATCH 1/1] x86/cqm: Cqm requirements
>
> Fine. So we need this for ONE particular use case. And if that is not well
> documented including the underlying mechanics to analyze the data then this
> will be a nice source of confusion for Joe User.
>
> I still think that this can be done differently while keeping the overhead
> small.
>
> You look at this from the existing perf mechanics which require high
> overhead context switching machinery. But that's just wrong because that's
> not how the cache and bandwidth monitoring works.
>
> Contrary to the other perf counters, CQM and MBM are based on a context
> selectable set of counters which do not require readout and reconfiguration
> when the switch happens.
>
> Especially with CAT in play, the context switch overhead is there already
> when CAT partitions need to be switched. So switching the RMID at the same
> time is basically free, if we are smart enough to do an equivalent to the
> CLOSID context switch mechanics and ideally combine both into a single MSR
> write.
>
> With that the low overhead periodic sampling can read N counters which are
> related to the monitored set and provide N separate results. For bandwidth
> the aggregation is a simple ADD and for cache residency it's pointless.
>
> Just because perf was designed with the regular performance counters in
> mind (way before that CQM/MBM stuff came around) does not mean that we
> cannot change/extend that if it makes sense.
>
> And looking at the way Cache/Bandwidth allocation and monitoring works, it
> makes a lot of sense. Definitely more than shoving it into the current modus
> operandi with duct tape just because we can.
>
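To make the "single MSR write" idea concrete, here is a minimal sketch of how RMID and CLOSID could be switched together. It assumes the IA32_PQR_ASSOC layout (RMID in bits 9:0, CLOSID in bits 63:32); the struct and function names are hypothetical and only illustrate the idea, they are not taken from any existing patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define PQR_RMID_MASK 0x3ffULL /* assumed: RMID occupies bits 9:0 */

/* Hypothetical per-CPU cache of the last values written to the MSR. */
struct pqr_state {
	uint32_t rmid;
	uint32_t closid;
};

/* Pack RMID (low bits) and CLOSID (high 32 bits) into one 64-bit value. */
static inline uint64_t pqr_assoc_value(uint32_t rmid, uint32_t closid)
{
	return ((uint64_t)closid << 32) | ((uint64_t)rmid & PQR_RMID_MASK);
}

/*
 * Called at context switch. Returns true and fills *msr_val only when
 * either ID changed, so an unchanged pair costs no MSR write at all.
 */
static bool pqr_switch(struct pqr_state *cpu, uint32_t rmid, uint32_t closid,
		       uint64_t *msr_val)
{
	if (cpu->rmid == rmid && cpu->closid == closid)
		return false;

	cpu->rmid = rmid;
	cpu->closid = closid;
	*msr_val = pqr_assoc_value(rmid, closid);
	return true;
}
```

In a real kernel path the caller would follow a true return with one wrmsr of *msr_val; that part is left out here so the sketch stays self-contained.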
You make a good point. The use case I described can be better served by
the low-overhead monitoring groups that Fenghua is working on. That
info can then be merged with the per-CPU profile collected for non-RDT
events.
I am OK with removing the perf-like CPU filtering from the requirements.
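For the bandwidth side, the aggregation over the N counters of a monitored set is just an ADD, as noted in the quoted text. A rough sketch, assuming each entry is the counter delta for one RMID since the previous sample and that a single scaling factor converts counter units to bytes (the function and parameter names are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum the per-RMID counter deltas of one monitored set and scale the
 * result to bytes. delta_chunks[i] is assumed to be the change in the
 * i-th RMID's bandwidth counter since the last periodic sample.
 */
static uint64_t mbm_group_bytes(const uint64_t *delta_chunks, size_t n,
				uint64_t bytes_per_chunk)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += delta_chunks[i] * bytes_per_chunk;

	return total;
}
```

No such sum makes sense for cache residency, so those N readings would stay as separate results.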
Thanks,
David