Message-ID: <alpine.DEB.2.21.1802201058350.24268@nanos.tec.linutronix.de>
Date: Tue, 20 Feb 2018 11:00:49 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Reinette Chatre <reinette.chatre@...el.com>
cc: fenghua.yu@...el.com, tony.luck@...el.com, gavin.hindman@...el.com,
vikas.shivappa@...ux.intel.com, dave.hansen@...el.com,
mingo@...hat.com, hpa@...or.com, x86@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH V2 11/22] x86/intel_rdt: Associate pseudo-locked
regions with its domain
On Mon, 19 Feb 2018, Reinette Chatre wrote:
> On 2/19/2018 3:19 PM, Thomas Gleixner wrote:
> > On Mon, 19 Feb 2018, Reinette Chatre wrote:
> >> On 2/19/2018 1:19 PM, Thomas Gleixner wrote:
> >>> On Tue, 13 Feb 2018, Reinette Chatre wrote:
> >>>
> >>>> After a pseudo-locked region is locked it needs to be associated with
> >>>> the RDT domain representing the pseudo-locked cache so that its life
> >>>> cycle can be managed correctly.
> >>>>
> >>>> Only a single pseudo-locked region can exist on any cache instance so we
> >>>> maintain a single pointer to a pseudo-locked region from each RDT
> >>>> domain.
> >>>
> >>> Why is only a single pseudo locked region possible?
> >>
> >> The setup of a pseudo-locked region requires the usage of wbinvd. If a
> >> second pseudo-locked region is thus attempted it will evict the
> >> pseudo-locked data of the first.
> >
> > Why does it need wbinvd? wbinvd is a big hammer. What's wrong with clflush?
>
> wbinvd is required by this hardware supported feature but limited to the
> creation of the pseudo-locked region. An administrator could dedicate a
> portion of cache to pseudo-locking and applications using this region
> can come and go. The pseudo-locked region lifetime need not be tied to
> application lifetime. The pseudo-locked region could be set up once on
> boot and remain for the lifetime of the system.
>
> Even so, understanding that it is a big hammer, I did explore the
> alternatives: clflush, clflushopt, and clwb. Finding them all to
> perform poorly(*), I went further to explore whether these other
> instructions could be made to perform as well as wbinvd with some
> additional supporting work. That work included looping over the data
> more times than is done for wbinvd, reducing the size of the locked
> memory relative to the cache size, leaving unused space between the
> pseudo-locked region and other regions, and leaving unmapped memory at
> the end of the pseudo-locked region.
>
> In addition to the above research from my side I also followed up with
> the CPU architects directly to question the usage of these instructions
> instead of wbinvd.
What was their answer? This really wants a proper explanation and not just
experimentation results as it makes absolutely no sense at all.
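
To make the question concrete: unlike wbinvd, which writes back and
invalidates the entire cache, a clflush-based approach touches only the
cache lines that back a given address range. A minimal userspace sketch
of that idea, assuming 64-byte cache lines and an x86 compiler that
provides the SSE2 intrinsics; the helper name flush_range_clflush is
hypothetical and is not taken from the patch set:

```c
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence (x86 SSE2 intrinsics) */

/* Assumption: 64-byte cache lines. Real code should query the CPU. */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: flush every cache line overlapping
 * [addr, addr + len) with clflush and return the number of lines
 * flushed. Only the named lines are evicted; the rest of the cache
 * is left alone, which is the contrast with wbinvd.
 */
static size_t flush_range_clflush(const void *addr, size_t len)
{
    uintptr_t p, end;
    size_t lines = 0;

    if (len == 0)
        return 0;

    /* Round the start down to a cache-line boundary. */
    p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    end = (uintptr_t)addr + len;

    _mm_mfence();  /* order earlier stores before the flushes */
    for (; p < end; p += CACHE_LINE_SIZE) {
        _mm_clflush((const void *)p);
        lines++;
    }
    _mm_mfence();  /* wait for the flushes before continuing */

    return lines;
}
```

This only shows the mechanics of a per-line flush. It says nothing about
whether such a loop is sufficient for setting up a pseudo-locked region,
which is exactly what needs explaining.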
Thanks,
tglx