lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 28 Jan 2021 12:24:33 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Alexander Sverdlin <alexander.sverdlin@...ia.com>
Cc:     Will Deacon <will@...nel.org>, Ingo Molnar <mingo@...hat.com>,
        linux-kernel@...r.kernel.org, Russell King <linux@...linux.org.uk>,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 1/2] qspinlock: Ensure writes are pushed out of core
 write buffer

On Thu, Jan 28, 2021 at 08:36:24AM +0100, Alexander Sverdlin wrote:

> >> diff --git a/kernel/locking/mcs_spinlock.h b/kernel/locking/mcs_spinlock.h
> >> index 5e10153..10e497a 100644
> >> --- a/kernel/locking/mcs_spinlock.h
> >> +++ b/kernel/locking/mcs_spinlock.h
> >> @@ -89,6 +89,11 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
> >>  		return;
> >>  	}
> >>  	WRITE_ONCE(prev->next, node);
> >> +	/*
> >> +	 * This is necessary to make sure that the corresponding "while" in the
> >> +	 * mcs_spin_unlock() doesn't loop forever
> >> +	 */
> >> +	smp_wmb();
> > If it loops forever, that's broken hardware design; store buffers need to
> > drain. I don't think we should add unconditional barriers to bodge this.
> 
> The comment is a bit exaggerating the situation, but it's undeterministic and you see the
> measurements above. Something is wrong in the qspinlocks code, please consider this patch
> "RFC", but something has to be done here.

The qspinlock code has been TLA+ modelled and has had extensive memory
ordering analysis. It has had lots of runtime on extremely large x86,
arm64, and Power machines. I'm fairly confident there is nothing wrong.

What I do think is more likely is that your platform is broken, it
wouldn't be the first MIPS that's got store-buffers completely wrong,
see commit:

  a30718868915 ("MIPS: Change definition of cpu_relax() for Loongson-3")

Working around micro arch store-buffer issues is not something the
generic code is for.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ