Message-ID: <1389380100.32504.172.camel@ppwaskie-mobl.amr.corp.intel.com>
Date: Fri, 10 Jan 2014 18:55:11 +0000
From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Tejun Heo <tj@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
"Ingo Molnar" <mingo@...hat.com>, "H. Peter Anvin" <hpa@...or.com>,
Li Zefan <lizefan@...wei.com>,
"containers@...ts.linux-foundation.org"
<containers@...ts.linux-foundation.org>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/4] x86: Add Cache QoS Monitoring (CQM) support
On Tue, 2014-01-07 at 22:12 +0100, Peter Zijlstra wrote:
> Maybe its me (its late) but I can't follow.
>
> So if every cacheline is tagged with both CR3 and RMID (on all levels --
> I get that it needs to propagate etc..) then you can, upon observing a
> new CR3,RMID pair, iterate the entire cache for the matching CR3 and
> update its RMID.
>
> This, while expensive, would fairly quickly propagate changes.
>
> Now I'm not at all sure cachelines are CR3 tagged.
>
> The above has downsides in that you cannot use RMIDs to slice into
> processes, where a pure RMID (without CR3 relation, even if cachelines
> are CR3 tagged) can slice processes -- note that process is an
> address-space/CR3 collection of threads.
>
> A pure RMID tagging solution would not allow the immediate update and
> would require on demand updates on new cacheline usage.
>
> This makes switching RMIDs effects slower to propagate.
> > > The other possible interpretation is that it updates on-demand whenever
> > > it touches a cacheline. But in that case, how do you deal with the
> > > non-exclusive states? Does the last RMID to touch a non-exclusive
> > > cacheline simply claim the entire line?
> >
> > I don't believe it claims the whole line; I had that exact discussion
> > awhile ago with the CPU architect, and this didn't appear broken before.
> > I will ask him again though since that discussion was over a year ago.
> >
> > > But that doesn't avoid the problem; because as soon as you change the
> > > PQR_ASSOC RMID you still need to go run for a while to touch 'all' your
> > > lines.
> > >
> > > This duration is indeterminate; which again brings us back to needing to
> > > first wipe the entire cache.
> >
> > I asked hpa if there is a clean way to do that outside of a WBINVD, and
> > the answer is no.
> >
> > I've sent the two outstanding questions off to the CPU architect, I'll
> > let you know what he says once I hear.
>
> Much appreciated; so I'd like a complete description of how this thing
> works, with in particular when exactly lines are tagged.

I've spoken with the CPU architect, and he's set me straight. I was
getting some simulation data and reality mixed up, so apologies.

The cacheline is tagged with the RMID being tracked when it's brought
into the cache. That is the only time it's tagged; the tag does not get
updated afterwards (I was looking at data showing the impact if it were
updated).

If there are frequent RMID updates for a particular process, then it is
possible for any old data of that process still in the cache to remain
accounted to a different RMID. This really is workload dependent, and my
architect provided data showing that this occurrence is pretty much in
the noise.

Also, I did ask about the granularity of the RMID, and it is
per-cacheline. So if there is a non-exclusive cacheline, then the
occupancy data in the other part of the cacheline will still count
against the line's RMID.
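
To make the fill-time tagging rule concrete, here is a minimal toy model. It is only a sketch of the behaviour described above, not of the hardware: the `ToyCache` class and its methods are hypothetical names, addresses stand in for whole cachelines, and there is no capacity limit or replacement policy.

```python
from collections import Counter


class ToyCache:
    """Toy model: a line is tagged with the active RMID at fill time only."""

    def __init__(self):
        self.active_rmid = 0   # stands in for the RMID programmed in PQR_ASSOC
        self.line_rmid = {}    # cacheline address -> RMID tag set at fill

    def set_rmid(self, rmid):
        # Changing the active RMID does not retag lines already resident.
        self.active_rmid = rmid

    def touch(self, addr):
        # Tag only on a miss (fill); a hit leaves the existing tag alone.
        if addr not in self.line_rmid:
            self.line_rmid[addr] = self.active_rmid

    def evict(self, addr):
        self.line_rmid.pop(addr, None)

    def occupancy(self):
        # Number of resident lines per RMID tag.
        return Counter(self.line_rmid.values())


cache = ToyCache()
cache.set_rmid(1)
for addr in range(8):          # lines 0..7 fill under RMID 1
    cache.touch(addr)

cache.set_rmid(2)
for addr in range(4, 12):      # 4..7 hit and keep RMID 1; 8..11 fill as RMID 2
    cache.touch(addr)

occ = cache.occupancy()
print(occ[1], occ[2])          # prints "8 4"
```

In this model the lines brought in under RMID 1 stay charged to RMID 1 until they are evicted and refilled, which is the "old data accounted to a different RMID" effect; how much that matters on real workloads is the data the architect referred to.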
> So my current mental model would tag a line with the current (ASSOC)
> RMID on:
> - load from DRAM -> L*, even for non-exclusive
> - any to exclusive transition
>
> The result of such rules is that when the effective RMID of a task
> changes it takes an indeterminate amount of time before the residency
> stats reflect reality again.
>
> Furthermore; the IA32_QM_CTR is a misnomer as its a VALUE not a COUNTER.
> Not to mention the entire SDM 17.14.2 section is a mess; it purports to
> describe how to detect the thing using CPUID but then also maybe
> describes how to program it.
I've passed this feedback on to the owner of that section of the SDM.
There is an update due this month, and it will include some changes to
this section (along with some additions).
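
On the VALUE versus COUNTER point, a small sketch of how a raw IA32_QM_CTR reading might be interpreted. The bit layout assumed here (bit 63 = error, bit 62 = data unavailable, bits 61:0 = occupancy in units of the CPUID-reported upscaling factor) is my reading of the SDM and should be checked against the updated section; `decode_qm_ctr` and the constant names are hypothetical, and the raw values are made up for illustration.

```python
# Assumed layout of a raw IA32_QM_CTR value (check against the SDM).
QM_CTR_ERROR = 1 << 63          # assumed: invalid RMID or event selected
QM_CTR_UNAVAILABLE = 1 << 62    # assumed: no data available for this RMID
QM_CTR_DATA_MASK = (1 << 62) - 1


def decode_qm_ctr(raw, upscale_bytes):
    """Return occupancy in bytes, or None if the data is unavailable.

    upscale_bytes is the conversion factor the CPU reports via CPUID;
    it is passed in here rather than read from hardware.
    """
    if raw & QM_CTR_ERROR:
        raise ValueError("IA32_QM_CTR reported an error")
    if raw & QM_CTR_UNAVAILABLE:
        return None
    return (raw & QM_CTR_DATA_MASK) * upscale_bytes


# Two successive made-up readings for the same RMID. Each one is a
# snapshot of current occupancy, so the second may be smaller than the
# first and subtracting them does not give an event count.
first = decode_qm_ctr(100, 65536)
second = decode_qm_ctr(40, 65536)
print(first, second)
```

Because each reading is an instantaneous occupancy, a later read can be lower than an earlier one, which is why treating the register as a monotonic counter would be wrong.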

I should have my alternate implementation sent out shortly; I'm just
working a few kinks out of it. This is the proc-based and sysfs-based
interface that will rely on a userspace program to handle the grouping
and assignment logic.

Cheers,
-PJ
--
PJ Waskiewicz Open Source Technology Center
peter.p.waskiewicz.jr@...el.com Intel Corp.