Message-ID: <20180928152646.mzjpwtj3qqvdnrlx@linutronix.de>
Date: Fri, 28 Sep 2018 17:26:46 +0200
From: Kurt Kanzenbach <kurt.kanzenbach@...utronix.de>
To: Will Deacon <will.deacon@....com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
linux-kernel@...r.kernel.org,
Daniel Wagner <daniel.wagner@...mens.com>,
Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>,
Boqun Feng <boqun.feng@...il.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Mark Rutland <mark.rutland@....com>
Subject: Re: [Problem] Cache line starvation
On Fri, Sep 28, 2018 at 11:05:21AM +0200, Kurt Kanzenbach wrote:
> Hi Thomas,
>
> On Thu, Sep 27, 2018 at 04:47:47PM +0200, Thomas Gleixner wrote:
> > On Thu, 27 Sep 2018, Kurt Kanzenbach wrote:
> > > On Thu, Sep 27, 2018 at 04:25:47PM +0200, Kurt Kanzenbach wrote:
> > > > However, the issue still triggers fine. With stress-ng we're able to
> > > > generate latency in millisecond range. The only workaround we've found
> > > > so far is to add a "delay" in cpu_relax().
> > >
> > > It might be interesting to you how we added the delay. We used:
> > >
> > > static inline void cpu_relax(void)
> > > {
> > > 	volatile int i = 0;
> > >
> > > 	asm volatile("yield" ::: "memory");
> > > 	while (i++ <= 1000);
> > > }
> > >
> > > Of course it's not efficient, but it works.
> >
> > I wonder if it's just the store on the stack which makes it work. I've
> > seen that when instrumenting x86: as long as the careful instrumentation
> > stayed in registers it failed. Once it was too much and the stack got
> > involved, it vanished.
>
> I've performed more tests: Adding a store to a global variable just
> before calling cpu_relax() doesn't help. Furthermore, adding up to 20
> yield instructions (just like you did on x86) didn't work either.
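For clarity, the two variants tested above could look like the following userspace sketch (the function names and the global are mine, purely for illustration; the arm64 `yield` instruction is replaced by a plain compiler barrier so the sketch builds on any architecture):

```c
/* Hypothetical global used for the store-before-relax experiment. */
static volatile int relax_flag;

/* Variant 1: store to a global variable just before relaxing;
 * this did not help. */
static inline void cpu_relax_store(void)
{
	relax_flag = 1;
	__asm__ __volatile__("" ::: "memory");	/* stands in for "yield" */
}

/* Variant 2: the delay-loop cpu_relax() quoted above;
 * inefficient, but it made the starvation go away. */
static inline void cpu_relax_delay(void)
{
	volatile int i = 0;

	__asm__ __volatile__("" ::: "memory");	/* stands in for "yield" */
	while (i++ <= 1000)
		;
}
```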
In addition, the stress-ng test triggers the issue on v4.14-rt and
v4.18-rt as well.
As v4.18-rt still uses the old spinlock implementation, I've backported
the qspinlock implementation to v4.18-rt. The commits I've identified
are:
- 598865c5f32d ("arm64: barrier: Implement smp_cond_load_relaxed")
- c11090474d70 ("arm64: locking: Replace ticket lock implementation with qspinlock")
Using these commits it's still possible to trigger the issue. But it
takes longer.
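For context, `smp_cond_load_relaxed()` (the primitive added by the first commit above) spins until a condition on a loaded value becomes true; the arm64 version can park in `wfe` instead of burning cycles. A rough userspace sketch of the generic fallback semantics (simplified; GCC statement expressions and `__typeof__` assumed):

```c
/* Rough sketch of the generic smp_cond_load_relaxed() fallback
 * (include/asm-generic/barrier.h): spin with plain relaxed loads until
 * cond_expr, evaluated against the loaded value VAL, becomes true, then
 * return VAL. The arm64 implementation from commit 598865c5f32d can
 * instead wait in wfe rather than spinning continuously. */
#define smp_cond_load_relaxed_sketch(ptr, cond_expr) ({		\
	__typeof__(*(ptr)) VAL;					\
	for (;;) {						\
		VAL = *(volatile __typeof__(*(ptr)) *)(ptr);	\
		if (cond_expr)					\
			break;					\
		/* cpu_relax() would sit here */		\
	}							\
	VAL;							\
})
```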
Did I miss anything?
Thanks,
Kurt