Message-ID: <20210105132623.GB11108@willie-the-truck>
Date: Tue, 5 Jan 2021 13:26:23 +0000
From: Will Deacon <will@...nel.org>
To: Andy Lutomirski <luto@...nel.org>
Cc: Nicholas Piggin <npiggin@...il.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
X86 ML <x86@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Catalin Marinas <catalin.marinas@....com>,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
LKML <linux-kernel@...r.kernel.org>,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
Michael Ellerman <mpe@...erman.id.au>,
Paul Mackerras <paulus@...ba.org>,
stable <stable@...r.kernel.org>
Subject: Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()
Hi Andy,
Sorry for the slow reply, I was socially distanced from my keyboard.
On Mon, Dec 28, 2020 at 04:36:11PM -0800, Andy Lutomirski wrote:
> On Mon, Dec 28, 2020 at 4:11 PM Nicholas Piggin <npiggin@...il.com> wrote:
> > > +static inline void membarrier_sync_core_before_usermode(void)
> > > +{
> > > + /*
> > > + * XXX: I know basically nothing about powerpc cache management.
> > > + * Is this correct?
> > > + */
> > > + isync();
> >
> > This is not about memory ordering or cache management, it's about
> > pipeline management. Powerpc's return to user mode serializes the
> > CPU (aka the hardware thread, _not_ the core; another wrongness of
> > the name, but AFAIKS the HW thread is what is required for
> > membarrier). So this is wrong, powerpc needs nothing here.
>
> Fair enough. I'm happy to defer to you on the powerpc details. In
> any case, this just illustrates that we need feedback from a person
> who knows more about ARM64 than I do.
I think we're in a very similar boat to PowerPC, fwiw. Roughly speaking:
1. SYNC_CORE does _not_ perform any cache management; that is the
responsibility of userspace, either by executing the relevant
maintenance instructions (arm64) or a system call (arm32). Crucially,
the hardware will ensure that this cache maintenance is broadcast
to all other CPUs.
2. Even with all the cache maintenance in the world, a CPU could have
speculatively fetched stale instructions into its "pipeline" ahead of
time, and these are _not_ flushed by the broadcast maintenance instructions
in (1). SYNC_CORE provides a means for userspace to discard these stale
instructions.
3. The context synchronization event on exception entry/exit is
sufficient here. The Arm ARM isn't very good at describing what it
does, because it's in denial about the existence of a pipeline, but
it does have snippets such as:
(s/PE/CPU/)
| For all types of memory:
| The PE might have fetched the instructions from memory at any time
| since the last Context synchronization event on that PE.
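For illustration only, here is a minimal userspace sketch of how (1) and (2) fit together on Linux for a JIT-style code update. The helper names jit_sync_core_register() and jit_publish() are hypothetical, and the sketch assumes that __builtin___clear_cache() performs the userspace cache maintenance from (1) (maintenance instructions on arm64, a system call on arm32) and that the kernel headers define the SYNC_CORE membarrier commands. It is not taken from any existing project.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc does not provide a membarrier() wrapper, so call the syscall directly. */
static int membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/*
 * Hypothetical one-time setup: returns 0 if the kernel supports SYNC_CORE
 * and the process has registered its intent to use it, -1 otherwise.
 */
static int jit_sync_core_register(void)
{
	/* QUERY returns a bit mask of the supported commands. */
	int mask = membarrier(MEMBARRIER_CMD_QUERY, 0);

	if (mask < 0 || !(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE))
		return -1;

	return membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}

/*
 * Hypothetical helper: copy new instructions into place and make them
 * safe to execute on every thread of this process.
 */
static int jit_publish(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);

	/* (1) Cache maintenance is userspace's job; SYNC_CORE does not do it. */
	__builtin___clear_cache((char *)dst, (char *)dst + len);

	/*
	 * (2) Ask the kernel to make the other threads' CPUs pass through a
	 * context synchronization event, discarding stale instructions they
	 * may already have fetched.
	 */
	return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}
```

A caller would run jit_sync_core_register() once at startup and then jit_publish() after each code modification, before any thread jumps to the new instructions. If registration fails, the sketch has no fallback; what to do in that case is left open here.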
Interestingly, the architecture recently added a control bit to remove
this synchronisation from exception return, so if we set that then we'd
have a problem with SYNC_CORE and adding an ISB would be necessary (and
we could probably then make kernel->kernel returns cheaper, but I
suspect we're relying on this implicit synchronisation in other places
too).
Are you seeing a problem in practice, or did this come up while trying to
decipher the semantics of SYNC_CORE?
Will