[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190125095955.GA4500@hirez.programming.kicks-ass.net>
Date: Fri, 25 Jan 2019 10:59:55 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>, davem@...emloft.net,
daniel@...earbox.net, jakub.kicinski@...ronome.com,
netdev@...r.kernel.org, kernel-team@...com, mingo@...hat.com,
will.deacon@....com, Paul McKenney <paulmck@...ux.vnet.ibm.com>,
jannh@...gle.com
Subject: Re: [PATCH v4 bpf-next 1/9] bpf: introduce bpf_spin_lock
On Thu, Jan 24, 2019 at 03:58:59PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 24, 2019 at 07:01:09PM +0100, Peter Zijlstra wrote:
> > > - on architectures that don't support queued_spin_lock trivial lock is used.
> > > Note that arch_spin_lock cannot be used, since not all archs agree that
> > > zero == unlocked and sizeof(arch_spinlock_t) != sizeof(__u32).
> >
> > I really don't much like direct usage of qspinlock; esp. not as a
> > surprise.
> >
> > Why does it matter if 0 means unlocked; that's what
> > __ARCH_SPIN_LOCK_UNLOCKED is for.
> >
> > I get the sizeof(__u32) thing, but why not key off of that?
>
> what do you mean by 'key off of that' ?
> to use arch_spinlock_t instead of qspinlock ?
> That was my first attempt, but then I painfully found that
> its size on parisc is 16 bytes and we're not going to penalize bpf
> to waste that much space because of single architecture.
> sizeof(arch_spinlock_t) can be 1 byte too (on sparc).
PowerPC has 8 bytes for some config options IIRC.
> That would fit in __u32, but I figured it's cleaner to use qspinlock
> on all archs that support it and dumb_spin_lock on archs that dont.
>
> Another option is use to arch_spinlock_t when its sizeof==4
That's what I meant.
> and use dumb_spin_lock otherwise.
> It's doable, but imo still less clean than using qspinlock
> due to zero init. Since zero init is a lot less map work
> that zero inits all elements already.
>
> If arch_spinlock_t is used than at map init time we would need to
> walk all elements and do __ARCH_SPIN_LOCK_UNLOCKED assignment
> (and maps can have millions of elements).
> Not horrible, but 100% waste of cycles for x86/arm64 where qspinlock
> is used. Such waste can be workaround further by doing ugly
> #idef __ARCH_SPIN_LOCK_UNLOCKED == 0 -> don't do init loop.
> And then add another #ifdef for archs with sizeof(arch_spinlock_t)!=4
> to keep zero init for all map types that support bpf_spin_lock
> via dumb_spin_lock.
> Clearly at that point we're getting into ugliness everywhere.
> Hence I've used qspinlock directly.
OK; I see.. but do these locks really have enough contention to run into
trouble with the simple test-and-set lock?
[ I tried to propose a simple ticket lock, but then realized the virt
archs (s390,powerpc,etc.) would hate that and deleted everything again
]
Argh, what a mess..
Powered by blists - more mailing lists