Date:   Wed, 30 Dec 2020 12:33:02 +1000
From:   Nicholas Piggin <>
To:     Russell King - ARM Linux admin <>
Cc:     Arnd Bergmann <>,
        Benjamin Herrenschmidt <>,
        Catalin Marinas <>,
        Jann Horn <>,
        linux-arm-kernel <>,
        linux-kernel <>,
        linuxppc-dev <>,
        Andy Lutomirski <>,
        Mathieu Desnoyers <>,
        Michael Ellerman <>,
        paulmck <>, Paul Mackerras <>,
        Peter Zijlstra <>,
        stable <>, Will Deacon <>,
        x86 <>
Subject: Re: [RFC please help] membarrier: Rewrite sync_core_before_usermode()

Excerpts from Russell King - ARM Linux admin's message of December 29, 2020 8:44 pm:
> On Tue, Dec 29, 2020 at 01:09:12PM +1000, Nicholas Piggin wrote:
>> I think it should certainly be documented in terms of what guarantees
>> it provides to application, _not_ the kinds of instructions it may or
>> may not induce the core to execute. And if existing API can't be
>> re-documented sanely, then deprecated and new ones added that DTRT.
>> Possibly under a new system call, if arch's like ARM want a range
>> flush and we don't want to expand the multiplexing behaviour of
>> membarrier even more (sigh).
> The 32-bit ARM sys_cacheflush() is there only to support self-modifying
> code, and takes whatever actions are necessary to support that.
> Exactly what actions it takes are cache implementation specific, and
> should be of no concern to the caller, but the underlying thing is...
> it's to support self-modifying code.

       cacheflush() should not be used in programs intended to be portable.
       On Linux, this call first appeared on the MIPS architecture, but
       nowadays, Linux provides a cacheflush() system call on some other
       architectures, but with different arguments.

What a disaster. Another badly designed interface. It didn't originate
in Linux, but it sounds like we weren't to be outdone, so we messed it
up even worse.

Flushing caches is neither necessary nor sufficient for code modification
on many processors. Maybe some old MIPS-specific private thing was fine,
but certainly before it grew to other architectures, somebody should
have thought about it for more than 2 minutes. Sigh.

> Sadly, because it's existed for 20+ years, and it has historically been
> sufficient for other purposes too, it has seen quite a bit of abuse
> despite its design purpose not changing - it's been used by graphics
> drivers for example. They quickly learnt the error of their ways with
> ARMv6+, since it does not do sufficient for their purposes given the
> cache architectures found there.
> Let's not go around redesigning this after twenty odd years, requiring
> a hell of a lot of pain to users. This interface is called by code
> generated by GCC, so to change it you're looking at patching GCC as
> well as the kernel, and you basically will make new programs
> incompatible with older kernels - very bad news for users.

For something to be redesigned it had to have been designed in the first
place, so there is no danger of that, don't worry... But no, I never
suggested making incompatible changes to any existing system call; I
said "re-documented". And yes, I said deprecated, but in Linux that really
means kept indefinitely.

If ARM, MIPS, 68k etc. programs and toolchains keep using what they are
using, it'll keep working no problem.

The point is we're growing new interfaces and making the same mistakes.
The membarrier sync-core command is not portable
(ARCH_HAS_MEMBARRIER_SYNC_CORE), it's specified in terms of low level
processor operations rather than higher level intent, and it is not
sufficient for self-modifying code (without an additional cache flush
on some processors).
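
To make that concrete, here is a minimal sketch of roughly what a JIT
has to do today to approximate "new code is visible everywhere". It is
an illustration under assumptions, not a recommended implementation:
the helper names membarrier_call(), code_sync_init() and publish_code()
are made up for this example, it assumes Linux headers that define
MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, and it assumes GCC or Clang
for __builtin___clear_cache(). Note that the caller needs two separate
steps, and the second one only works on architectures that select
ARCH_HAS_MEMBARRIER_SYNC_CORE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical thin wrapper: invoke membarrier via syscall(2). */
static int membarrier_call(int cmd, unsigned int flags)
{
    return (int)syscall(__NR_membarrier, cmd, flags, 0);
}

/*
 * One-time setup. Returns 0 if the process is now registered for the
 * sync-core command, -1 if the kernel or architecture lacks it.
 */
static int code_sync_init(void)
{
    int mask = membarrier_call(MEMBARRIER_CMD_QUERY, 0);

    if (mask < 0 || !(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE))
        return -1;
    return membarrier_call(
        MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}

/*
 * Call after writing new instructions to [start, start + len).
 * Step 1 is the compiler's cache maintenance hook for the range.
 * Step 2 asks the kernel to make every thread of this process
 * execute a core serializing operation before it returns to user
 * code. Returns 0 on success, -1 if step 2 is unavailable.
 */
static int publish_code(void *start, size_t len)
{
    assert(start != NULL && len > 0);

    char *begin = start;

    __builtin___clear_cache(begin, begin + len);
    return membarrier_call(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0);
}
```

A caller would run code_sync_init() once at startup and publish_code()
after each batch of code writes, and would still need its own fallback
for the -1 case, which is exactly the portability problem.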

The application wants a call that says something like "memory modified
before the call will be visible as instructions (including illegal
instructions) to all threads in the program after the system call
returns, and no threads will be subject to any effects of executing the
previous contents of that memory".

So I think the basics are simple (although we should confirm with some
JIT and debugger etc. developers, and not just Android, mind you). There
are some complications in the details: address ranges, virtual/physical,
thread local vs process vs different process or system-wide, memory
ordering and propagation of i and d sides, etc. But that can be worked
through, erring on the side of sanity rather than pointless
micro-optimisations.

