Message-ID: <20190129091654.GD28485@hirez.programming.kicks-ass.net>
Date: Tue, 29 Jan 2019 10:16:54 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>, davem@...emloft.net,
daniel@...earbox.net, jakub.kicinski@...ronome.com,
netdev@...r.kernel.org, kernel-team@...com, mingo@...hat.com,
will.deacon@....com, Paul McKenney <paulmck@...ux.vnet.ibm.com>,
jannh@...gle.com
Subject: Re: bpf memory model. Was: [PATCH v4 bpf-next 1/9] bpf: introduce
bpf_spin_lock
On Mon, Jan 28, 2019 at 01:56:24PM -0800, Alexei Starovoitov wrote:
> On Mon, Jan 28, 2019 at 10:24:08AM +0100, Peter Zijlstra wrote:
> > Ah, but the loop won't be in the BPF program itself. The BPF program
> > would only have had the BPF_SPIN_LOCK instruction, the JIT then emits
> > code similar to queued_spin_lock()/queued_spin_unlock() (or calls to
> > out-of-line versions of them).
>
> As I said, we considered exactly that, and such an approach has a lot of
> downsides compared with the helper approach.
> Pretty much every time a new feature is added we evaluate whether it
> should be a new instruction or a new helper. 99% of the time we go with a new helper.
Ah; it seems I'm confused on helper vs instruction. As in, I've no idea
what a helper is.
> > There isn't anything that mandates the JIT uses the exact same locking
> > routines the interpreter does, is there?
>
> Sure. This bpf_spin_lock() helper can be optimized whichever way the kernel wants.
> For example, the bpf_map_lookup_elem() call is _inlined_ by the verifier for certain map types.
> JITs don't even need to do anything. It looks like a function call from the bpf prog's
> point of view, but in JITed code it is a sequence of native instructions.
>
> Say tomorrow we find out that bpf_prog->bpf_spin_lock()->queued_spin_lock()
> takes too much time; then we can inline the fast path of queued_spin_lock
> directly into the bpf prog and save the function call cost.
OK, so then the JIT can optimize helpers. Would it not make sense to
have the simple test-and-set spinlock in the generic code and have the
JITs use arch_spinlock_t where appropriate?
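
For illustration only, a "simple test-and-set spinlock" of the kind meant
above might look roughly like the sketch below. It is written with C11
atomics so it builds in userspace; it is not the code from the patch
series, and the names tas_lock, tas_lock_acquire, tas_trylock and
tas_lock_release are hypothetical. Kernel code would use the kernel's own
atomic primitives instead.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration; not the kernel's bpf_spin_lock. */
struct tas_lock {
	atomic_int val;		/* 0 = unlocked, 1 = locked */
};

/* Try once; returns true if the lock was taken by this call. */
static bool tas_trylock(struct tas_lock *l)
{
	/* The exchange returns the previous value; 0 means it was free. */
	return atomic_exchange_explicit(&l->val, 1,
					memory_order_acquire) == 0;
}

static void tas_lock_acquire(struct tas_lock *l)
{
	while (!tas_trylock(l)) {
		/*
		 * Wait on a plain load until the lock looks free, so
		 * waiters do not keep issuing atomic writes to the line.
		 */
		while (atomic_load_explicit(&l->val, memory_order_relaxed))
			;
	}
}

static void tas_lock_release(struct tas_lock *l)
{
	/* Release ordering publishes the critical section's stores. */
	atomic_store_explicit(&l->val, 0, memory_order_release);
}
```

Such a lock is unfair under contention, which is one reason an
architecture might prefer its own arch_spinlock_t where one is available.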