lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200717161145.GA1150454@rowland.harvard.edu>
Date:   Fri, 17 Jul 2020 12:11:45 -0400
From:   Alan Stern <stern@...land.harvard.edu>
To:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:     Nicholas Piggin <npiggin@...il.com>, paulmck <paulmck@...nel.org>,
        Anton Blanchard <anton@...abs.org>,
        Arnd Bergmann <arnd@...db.de>,
        linux-arch <linux-arch@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
        Andy Lutomirski <luto@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>, x86 <x86@...nel.org>
Subject: Re: [RFC PATCH 4/7] x86: use exit_lazy_tlb rather than
 membarrier_mm_sync_core_before_usermode

> > I agree with Nick: A memory barrier is needed somewhere between the
> > assignment at 6 and the return to user mode at 8.  Otherwise you end up
> > with the Store Buffer pattern having a memory barrier on only one side,
> > and it is well known that this arrangement does not guarantee any
> > ordering.
> 
> Yes, I see this now. I'm still trying to wrap my head around why the memory
> barrier at the end of membarrier() needs to be paired with a scheduler
> barrier though.

The memory barrier at the end of membarrier() on CPU0 is necessary in 
order to enforce the guarantee that any writes occurring on CPU1 before 
the membarrier() is executed will be visible to any code executing on 
CPU0 after the membarrier().  Ignoring the kthread issue, we can have:

	CPU0			CPU1
				x = 1
				barrier()
				y = 1
	r2 = y
	membarrier():
	  a: smp_mb()
	  b: send IPI		IPI-induced mb
	  c: smp_mb()
	r1 = x

The writes to x and y are unordered by the hardware, so it's possible to 
have r2 = 1 even though the write to x doesn't execute until b.  If the 
memory barrier at c is omitted then "r1 = x" can be reordered before b 
(although not before a), so we get r1 = 0.  This violates the guarantee 
that membarrier() is supposed to provide.

The timing of the memory barrier at c has to ensure that it executes 
after the IPI-induced memory barrier on CPU1.  If it happened before 
then we could still end up with r1 = 0.  That's why the pairing matters.

I hope this helps your head get properly wrapped.  :-)

Alan Stern

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ