Message-Id: <33241b25-4d45-4278-a4e6-ec9c12b0e1f3@www.fastmail.com>
Date: Thu, 17 Jun 2021 07:00:26 -0700
From: "Andy Lutomirski" <luto@...nel.org>
To: "Mark Rutland" <mark.rutland@....com>
Cc: "Russell King (Oracle)" <linux@...linux.org.uk>,
"the arch/x86 maintainers" <x86@...nel.org>,
"Dave Hansen" <dave.hansen@...el.com>,
"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
linux-mm@...ck.org, "Andrew Morton" <akpm@...ux-foundation.org>,
"Mathieu Desnoyers" <mathieu.desnoyers@...icios.com>,
"Nicholas Piggin" <npiggin@...il.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 7/8] membarrier: Remove arm (32) support for SYNC_CORE
On Thu, Jun 17, 2021, at 6:51 AM, Mark Rutland wrote:
> On Thu, Jun 17, 2021 at 06:41:41AM -0700, Andy Lutomirski wrote:
> > In any event, I’m even more convinced that no new SYNC_CORE arches
> > should be added. We need a new API that just does the right thing.
>
> My intuition is the other way around, and that this is a generally
> useful thing for architectures that require context synchronization.
Except that you can't use it in a generic way. You have to know the specific rules for your arch.
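For concreteness, here is a minimal sketch of how userspace invokes the existing SYNC_CORE command on Linux. The helper names sys_membarrier() and sync_core_all_threads() are hypothetical, chosen for illustration. The sketch covers only the portable part; whatever cache maintenance the architecture requires before this call is exactly the arch-specific knowledge the caller still has to supply.

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Raw syscall wrapper, so the sketch does not depend on a libc wrapper. */
int sys_membarrier(int cmd, unsigned int flags)
{
	return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/*
 * Hypothetical helper: on success, all running threads of this process
 * have executed a core-serializing instruction by the time this returns.
 * Returns 0 on success, -1 if SYNC_CORE is unavailable or a call fails.
 */
int sync_core_all_threads(void)
{
	int supported = sys_membarrier(MEMBARRIER_CMD_QUERY, 0);

	if (supported < 0 ||
	    !(supported & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE))
		return -1;

	/* The process must register before using the expedited command. */
	if (sys_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0) < 0)
		return -1;

	/*
	 * Any arch-specific cache maintenance for the modified code must
	 * already have been done by the caller; it is not shown here.
	 */
	if (sys_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0) < 0)
		return -1;

	return 0;
}
```

A JIT would call sync_core_all_threads() after writing new instructions and before letting other threads run them, treating a -1 return as "not supported here".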
>
> It's not clear to me what "the right thing" would mean specifically, and
> on architectures with userspace cache maintenance JITs can usually do
> the most optimal maintenance, and only need help for the context
> synchronization.
>
This I simply don't believe -- I doubt that any sane architecture really works like this. I wrote an email about it to Intel that apparently generated internal discussion but no results. Consider:
mmap(some shared library, some previously unmapped address);
this does no heavyweight synchronization, at least on x86. There is no "serializing" instruction in the fast path, and it *works* despite anything the SDM may or may not say.
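As a rough illustration of the mmap() case above, the following sketch maps a file read+execute at a kernel-chosen address, much as a dynamic loader would for library text. The helper name map_code_file() is hypothetical, and the sketch makes no claim about what any architecture manual requires; it only shows that the userspace side contains no explicit synchronization step.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the file at `path` as readable and executable.
 * Returns the mapping and stores its length in *len_out, or returns NULL
 * on failure.
 */
void *map_code_file(const char *path, size_t *len_out)
{
	struct stat st;
	void *addr;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) < 0 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}

	/* A NULL hint lets the kernel pick a previously unmapped address. */
	addr = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_EXEC,
		    MAP_PRIVATE, fd, 0);
	close(fd); /* the mapping remains valid after the fd is closed */
	if (addr == MAP_FAILED)
		return NULL;

	/*
	 * Note what is absent: the caller issues no barrier or serializing
	 * instruction before it may jump into the new mapping.
	 */
	*len_out = (size_t)st.st_size;
	return addr;
}
```

The caller releases the mapping with munmap(addr, len) when done.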
We can and, IMO, should develop a sane way for user programs to install instructions into VMAs, for security-conscious software to verify them (by splitting the read and write sides?), and for their consumers to execute them, without knowing any arch details. And I think this can be done with no IPIs except for possible TLB flushing when needed, at least on most architectures. It would require a nontrivial amount of design work, and it would not resemble sys_cacheflush() or SYNC_CORE.
--Andy