Message-ID: <alpine.DEB.2.20.1707060837580.1771@nanos>
Date: Thu, 6 Jul 2017 08:51:59 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Tony Luck <tony.luck@...il.com>
cc: Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
X86-ML <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
ravi.v.shankar@...el.com, vikas.shivappa@...el.com,
"Yu, Fenghua" <fenghua.yu@...el.com>, andi.kleen@...el.com
Subject: Re: [PATCH 08/21] x86/intel_rdt/cqm: Add RMID(Resource monitoring
ID) management

On Wed, 5 Jul 2017, Tony Luck wrote:

> > In case that a RMID was never used on a particular package, the state check
> > forces an IPI on all packages unconditionally. That's suboptimal at least.
> >
> > We know on which package a given RMID was used, so we could restrict the
> > checks to exactly these packages, but I'm not sure it's worth the
> > trouble. We might at least document that and explain why this is
> > implemented in that way.
>
> We only allocate RMIDs when a user makes a directory. I don't think
> we should consider options that slow down context switch in order to
> keep track of which packages were used just to make mkdir(2) a bit faster
> in the case where we need to check the limbo list.

It's not about speeding up mkdir. It's about preventing IPIs which walk a
limbo list with hundreds of RMID entries. I tested the current pile on a
BDW (Broadwell) which has 143 RMIDs, and the list walk plus the WRMSR/RDMSR
takes more than 100us in IPI context. That's just crap, seriously.

> We could make the check of the limbo list less costly by using a bitmask
> to keep track of which packages have already found that the llc_occupancy
> is below the threshold. But I'd question whether the extra complexity in the
> code was really worth it.

Delegating the check to an IPI which only gets invoked when we run out of
free RMIDs is the problem, and that needs to be fixed.

Whether we optimize it to avoid the work on packages which did not use
the RMID can be discussed, but replacing the current approach of
delegating a full list walk to an IPI is not really up for debate.

OTOH, the set_bit operation in the context switch path on a per-CPU local
variable is, aside from dirtying a cacheline, negligible compared to the
MSR write itself. And it burdens only the tasks which use monitoring, not
the normal and interesting non-monitoring case.

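To make the bookkeeping concrete, here is a rough userspace mock-up of the
idea, not kernel code and not what the patch series implements. Every name
and size in it (NR_PACKAGES, NR_RMIDS, rmid_used, rmid_mark_used and so on)
is made up for illustration. It uses plain non-atomic bit operations on one
bitmap per package, where the kernel would use set_bit() on per-CPU or
per-package data. The point is only that the context switch side sets one
bit, and the free side then knows which packages need an occupancy check.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* All names and sizes here are hypothetical, chosen for illustration. */
#define NR_PACKAGES   4
#define NR_RMIDS      144
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define RMID_WORDS    ((NR_RMIDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bitmap per package: bit N set means RMID N was loaded on it. */
static unsigned long rmid_used[NR_PACKAGES][RMID_WORDS];

/*
 * Context switch side. Called only when a monitored task is scheduled
 * in, so unmonitored tasks never touch the bitmap. Not atomic in this
 * mock-up; real code would need set_bit() or equivalent.
 */
static void rmid_mark_used(unsigned int pkg, unsigned int rmid)
{
	assert(pkg < NR_PACKAGES && rmid < NR_RMIDS);
	rmid_used[pkg][rmid / BITS_PER_WORD] |= 1UL << (rmid % BITS_PER_WORD);
}

/* Called once a package has confirmed the RMID is clean again. */
static void rmid_clear_used(unsigned int pkg, unsigned int rmid)
{
	assert(pkg < NR_PACKAGES && rmid < NR_RMIDS);
	rmid_used[pkg][rmid / BITS_PER_WORD] &= ~(1UL << (rmid % BITS_PER_WORD));
}

static bool rmid_was_used(unsigned int pkg, unsigned int rmid)
{
	assert(pkg < NR_PACKAGES && rmid < NR_RMIDS);
	return (rmid_used[pkg][rmid / BITS_PER_WORD] >> (rmid % BITS_PER_WORD)) & 1UL;
}

/*
 * Free side. Fills need_check[] with the packages on which the RMID was
 * ever loaded and returns how many there are. Packages that never saw
 * the RMID need no occupancy read at all.
 */
static unsigned int rmid_packages_to_check(unsigned int rmid, bool *need_check)
{
	unsigned int pkg, count = 0;

	for (pkg = 0; pkg < NR_PACKAGES; pkg++) {
		need_check[pkg] = rmid_was_used(pkg, rmid);
		if (need_check[pkg])
			count++;
	}
	return count;
}
```

With this, an RMID that only ever ran on packages 1 and 3 yields a check
set of exactly those two packages, and an RMID that never ran anywhere
yields an empty set, so it could be reused without any cross-package work.
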
Thanks,
tglx