netdev - Re: bpf memory model. Was: [PATCH v4 bpf-next 1/9] bpf: introduce bpf_spin

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190130085850.GA2278@hirez.programming.kicks-ass.net>
Date:   Wed, 30 Jan 2019 09:58:50 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Alexei Starovoitov <ast@...nel.org>, davem@...emloft.net,
        daniel@...earbox.net, jakub.kicinski@...ronome.com,
        netdev@...r.kernel.org, kernel-team@...com, mingo@...hat.com,
        will.deacon@....com, Paul McKenney <paulmck@...ux.vnet.ibm.com>,
        jannh@...gle.com
Subject: Re: bpf memory model. Was: [PATCH v4 bpf-next 1/9] bpf: introduce
 bpf_spin_lock

On Tue, Jan 29, 2019 at 06:32:13PM -0800, Alexei Starovoitov wrote:
> On Tue, Jan 29, 2019 at 10:16:54AM +0100, Peter Zijlstra wrote:
> > On Mon, Jan 28, 2019 at 01:56:24PM -0800, Alexei Starovoitov wrote:
> > > On Mon, Jan 28, 2019 at 10:24:08AM +0100, Peter Zijlstra wrote:
> > 
> > > > Ah, but the loop won't be in the BPF program itself. The BPF program
> > > > would only have had the BPF_SPIN_LOCK instruction, the JIT them emits
> > > > code similar to queued_spin_lock()/queued_spin_unlock() (or calls to
> > > > out-of-line versions of them).
> > > 
> > > As I said we considered exactly that and such approach has a lot of downsides
> > > comparing with the helper approach.
> > > Pretty much every time new feature is added we're evaluating whether it
> > > should be new instruction or new helper. 99% of the time we go with new helper.
> > 
> > Ah; it seems I'm confused on helper vs instruction. As in, I've no idea
> > what a helper is.
> 
> bpf helper is a normal kernel function that can be called from bpf program.
> In assembler it's a direct function call.

Ah, it is what is otherwise known as a standard library,

> > > > There isn't anything that mandates the JIT uses the exact same locking
> > > > routines the interpreter does, is there?
> > > 
> > > sure. This bpf_spin_lock() helper can be optimized whichever way the kernel wants.
> > > Like bpf_map_lookup_elem() call is _inlined_ by the verifier for certain map types.
> > > JITs don't even need to do anything. It looks like function call from bpf prog
> > > point of view, but in JITed code it is a sequence of native instructions.
> > > 
> > > Say tomorrow we find out that bpf_prog->bpf_spin_lock()->queued_spin_lock()
> > > takes too much time then we can inline fast path of queued_spin_lock
> > > directly into bpf prog and save function call cost.
> > 
> > OK, so then the JIT can optimize helpers. Would it not make sense to
> > have the simple test-and-set spinlock in the generic code and have the
> > JITs use arch_spinlock_t where appropriate?
> 
> I think that pretty much the same as what I have with qspinlock.
> Instead of taking a risk how JIT writers implement bpf_spin_lock optimization
> I'm using qspinlock on architectures that are known to support it.

I see the argument for it...

> So instead of starting with dumb test-and-set there will be faster
> qspinlock from the start on x86, arm64 and few others archs.
> Those are the archs we care about the most anyway. Other archs can take
> time to optimize it (if optimizations are necessary at all).
> In general hacking JITs is much harder and more error prone than
> changing core and adding helpers. Hence we avoid touching JITs
> as much as possible.

So archs/JITs are not trivially able to override those helper functions?
Because for example ARM (32bit) doesn't do qspinlock but it's
arch_spinlock_t is _much_ better than a TAS lock.

> Like map_lookup inlining optimization we do only when JIT is on.
> And we do it purely in the generic core. See array_map_gen_lookup().
> We generate bpf instructions only to feed them into JITs so they
> can replace them with native asm. That is much easier to implement
> correctly than if we were doing inlining in every JIT independently.

That seems prudent and fairly painful at the same time. I see why you
would want to share as much of it as possible, but being restricted to
BPF instructions seems very limiting.

Anyway, I'll not press the issue; but I do think that arch specific
helpers would make sense here.