Message-ID: <20160226113304.GA6356@twins.programming.kicks-ass.net>
Date: Fri, 26 Feb 2016 12:33:04 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Russell King <linux@....linux.org.uk>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
linux-api <linux-api@...r.kernel.org>,
Paul Turner <pjt@...gle.com>, Andrew Hunter <ahh@...gle.com>,
Andy Lutomirski <luto@...capital.net>,
Andi Kleen <andi@...stfloor.org>,
Dave Watson <davejwatson@...com>, Chris Lameter <cl@...ux.com>,
Ben Maurer <bmaurer@...com>, rostedt <rostedt@...dmis.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Josh Triplett <josh@...htriplett.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
Michael Kerrisk <mtk.manpages@...il.com>
Subject: Re: [PATCH v4 1/5] getcpu_cache system call: cache CPU number of
running thread
On Thu, Feb 25, 2016 at 05:17:51PM +0000, Mathieu Desnoyers wrote:
> ----- On Feb 25, 2016, at 12:04 PM, Peter Zijlstra peterz@...radead.org wrote:
>
> > On Thu, Feb 25, 2016 at 04:55:26PM +0000, Mathieu Desnoyers wrote:
> >> ----- On Feb 25, 2016, at 4:56 AM, Peter Zijlstra peterz@...radead.org wrote:
> >> The restartable sequences are intrinsically designed to work
> >> on per-cpu data, so they need to fetch the current CPU number
> >> within the rseq critical section. This is where the getcpu_cache
> >> system call becomes very useful when combined with rseq:
> >> getcpu_cache allows reading the current CPU number in a
> >> fraction of a cycle.
> >
> > Yes yes, I know how restartable sequences work.
> >
> > But what I worry about is that they want a cpu number and a sequence
> > number, and for performance it would be very good if those live in the
> > same cacheline.
> >
> > That means either getcpu needs to grow a seq number, or restartable
> > sequences need to _also_ provide the cpu number.
>
> If we plan things well, we could have both the cpu number and the
> seqnum in the same cache line, registered by two different system
> calls. It's up to user-space to organize those two variables
> to fit within the same cache-line.
I feel this is more fragile than needed. Why not do a single system call
that does both?
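
A rough sketch of the layout in question, not taken from either patch
set: if a single registration covered both values, user-space could
declare one cache-line-aligned per-thread area and the two fields would
share a line by construction. The struct name, field names, and the
64-byte line size below are assumptions for illustration only.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, typical for x86-64. */
#define CACHE_LINE_SIZE 64

/*
 * Hypothetical combined per-thread area. This is not an existing
 * kernel ABI; it only illustrates keeping both fields in one line.
 */
struct thread_percpu_state {
	alignas(CACHE_LINE_SIZE) int32_t cpu_id; /* current CPU number */
	uint32_t seqnum;                         /* rseq-style sequence number */
};

/* Aligning the first member pads the whole struct to one full line. */
static_assert(sizeof(struct thread_percpu_state) == CACHE_LINE_SIZE,
	      "state must occupy exactly one cache line");

/* One instance per thread, as a TLS variable would provide. */
static _Thread_local struct thread_percpu_state tls_state;

/* Return nonzero when both addresses fall within the same cache line. */
static int same_cache_line(const void *a, const void *b)
{
	return (uintptr_t)a / CACHE_LINE_SIZE == (uintptr_t)b / CACHE_LINE_SIZE;
}
```

With two independent registrations nothing enforces this property; each
caller would have to check something like same_cache_line() on its own.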
> getcpu_cache GETCPU_CACHE_SET operation takes the address where
> the CPU number should live as input.
>
> rseq system call could do the same for the seqnum address.
So I really don't like that. It means we have to track more kernel
state: we have to carry two pointers instead of one, we have to have
more update functions, etc.
That just increases the total overhead of all of this.
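
To make the bookkeeping difference concrete, here is a user-space
mock-up, not kernel code: all struct and function names are
hypothetical, and a real implementation would use __user pointers with
put_user() instead of plain stores.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-visible area holding both values. */
struct percpu_area {
	int32_t cpu_id;
	uint32_t seqnum;
};

/* Split design: each system call registers its own address. */
struct task_split {
	int32_t *cpu_cache;   /* registered via the getcpu_cache-style call */
	uint32_t *seq_cache;  /* registered via the rseq-style call */
};

/* Combined design: one registration, one pointer per task. */
struct task_combined {
	struct percpu_area *area;
};

/* Split design needs two checks and two independent stores. */
static void update_split(struct task_split *t, int32_t cpu)
{
	if (t->cpu_cache)
		*t->cpu_cache = cpu;
	if (t->seq_cache)
		(*t->seq_cache)++;
}

/* Combined design needs one check; both stores hit the same area. */
static void update_combined(struct task_combined *t, int32_t cpu)
{
	if (!t->area)
		return;
	t->area->cpu_id = cpu;
	t->area->seqnum++;
}
```

The per-task state doubles in the split variant, and every update has
to check and follow two pointers rather than one.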
> The question becomes: how do we introduce this to user-space,
> considering that only a single address per thread is allowed
> for each of getcpu_cache and rseq ?
>
> If both CPU number and seqnum are centralized in a TLS within
> e.g. glibc, that would be OK, but if we intend to allow libraries
> or applications to directly register their own getcpu_cache
> address and/or rseq, we may end up in situations where we have
> to fall back on using two different cache-lines. But how much
> should we care about performance in cases where non-generic
> libraries directly use those system calls ?
>
> Thoughts ?
Yeah, not sure, but that is a separate problem. Both your proposed code
and the rseq code have this. Having them as separate system calls just
increases the number of ways you can get it wrong.