linux-kernel - Re: csky: smp_mb__after

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20200807081253.GD2674@hirez.programming.kicks-ass.net>
Date:   Fri, 7 Aug 2020 10:12:53 +0200
From:   peterz@...radead.org
To:     Guo Ren <guoren@...nel.org>
Cc:     Ren Guo <ren_guo@...ky.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-csky@...r.kernel.org, mathieu.desnoyers@...icios.com,
        Will Deacon <will@...nel.org>
Subject: Re: csky: smp_mb__after_spinlock

On Fri, Aug 07, 2020 at 08:22:36AM +0800, Guo Ren wrote:
> Hi Peter,
> 
> On Thu, Aug 6, 2020 at 3:53 AM <peterz@...radead.org> wrote:
> >
> > Hi,
> >
> > While doing an audit of smp_mb__after_spinlock, I found that csky
> > defines it, why?
> >
> > CSKY only has smp_mb(), it doesn't override __atomic_acquire_fence or
> > otherwise special cases it's atomic*_acquire() primitives. It has an
> > explicit smp_mb() in its arch_spin_lock().
> 
> Yes, csky didn't implement ACQUIRE/RELEASE in spinlock.h. So it is a
> redundant and side-effect wrong macro, we should remove it to fixup.
> 
> TODO:
>  - implement csky's ACQUIRE/RELEASE barrier

Fair enough; please Cc me when you get to it.

> > Also, why have two implementations of all the locking?
> 
> I just kept my baby's codes :P. Now, we could remove it and just keep
> the ticket's one.
> 
> BTW, I want to change the spinlock to qspinlock, but csky only has
> 32-bit atomic operation in hardware.

Yeah, that's okay, you can do 'short' atomics using wider RmW.

> Any plan to deal with this in spinlock?
> 
> Maybe for the embedded scenario, qspinlock seems a bit heavy,

I tend to agree, qspinlock only really makes sense when you have
multiple cache domains and NUMA makes it shine.

> are any
> tickets-like comm spinlock infrastructures in the plan?

No. ticket locks are 'simple' and their implementation fairly straight
forward. Also, since the generic code only has cmpxchg, a generic ticket
lock would be sub-optimal on LL/SC (or even x86 for that matter).

It would look something like this, which I think is pretty terrible for
just about any arch compared to what you can do with inline asm.

struct ticket_lock {
	union {
		atomic_t	val;
		struct {
#ifdef __LITTLE_ENDIAN
			u16	head;
			u16	tail;
#else
			u16	tail;
			u16	head;
#endif
		};
	};
};

void ticket_lock(struct ticket_lock *lock)
{
	unsigned int val = atomic_read(&lock->val);
	struct ticket_lock lock;

	do {
		lock.val = ATOMIC_INIT(val);
		lock.head++;
	} while (!atomic_try_cmpxchg_relaxed(&lock->lock, &val, lock.val));

	atomic_cond_read_acquire(&lock->val, (VAL >> 16) == lock.head);
}

void ticket_unlock(struct ticket_lock *lock)
{
	atomic_fetch_inc_release(&lock->val);
}