lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 06 Dec 2020 09:14:57 +1000
From:   Nicholas Piggin <>
To:     Andy Lutomirski <>
Cc:     Anton Blanchard <>, Arnd Bergmann <>,
        linux-arch <>,
        LKML <>,
        Linux-MM <>,
        linuxppc-dev <>,
        Andy Lutomirski <>,
        Mathieu Desnoyers <>,
        Peter Zijlstra <>, X86 ML <>
Subject: Re: [PATCH 2/8] x86: use exit_lazy_tlb rather than

Excerpts from Andy Lutomirski's message of December 6, 2020 2:11 am:
>> On Dec 5, 2020, at 12:00 AM, Nicholas Piggin <> wrote:
>> I disagree. Until now nobody following it noticed that the mm gets
>> un-lazied in other cases, because that was not too clear from the
>> code (only indirectly using non-standard terminology in the arch
>> support document).
>> In other words, membarrier needs a special sync to deal with the case 
>> when a kthread takes the mm.
> I don’t think this is actually true. Somehow the x86 oddities about 
> CR3 writes leaked too much into the membarrier core code and comments. 
> (I doubt this is x86 specific.  The actual x86 specific part seems to 
> be that we can return to user mode without syncing the instruction 
> stream.)
> As far as I can tell, membarrier doesn’t care at all about laziness. 
> Membarrier cares about rq->curr->mm.  The fact that a cpu can switch 
> its actual loaded mm without scheduling at all (on x86 at least) is 
> entirely beside the point except insofar as it has an effect on 
> whether a subsequent switch_mm() call serializes.

Core membarrier itself doesn't care about laziness, which is why the
membarrier flush should go in exit_lazy_tlb() or other x86 specific
code (at least until more architectures did the same thing and we moved
it into generic code). I just meant this non-serialising return as 
documented in the membarrier arch enablement doc specifies the lazy tlb

If an mm was lazy tlb for a kernel thread and then it becomes unlazy,
and if switch_mm is serialising but return to user is not, then you
need a serialising instruction somewhere before return to user. unlazy
is the logical place to add that, because the lazy tlb mm (i.e., 
switching to a kernel thread and back without switching mm) is what 
opens the hole.

> If we notify 
> membarrier about x86’s asynchronous CR3 writes, then membarrier needs 
> to understand what to do with them, which results in an unmaintainable 
> mess in membarrier *and* in the x86 code.

How do you mean? exit_lazy_tlb is the opposite, core scheduler notifying
arch code about when an mm becomes not-lazy, and nothing to do with
membarrier at all even. It's a convenient hook to do your un-lazying.
I guess you can do it also checking things in switch_mm and keeping state
in arch code, I don't think that's necessarily the best place to put it.

So membarrier code is unchanged (it cares that the serialise is done at
un-lazy time), core code is simpler (no knowledge of this membarrier 
quirk and it already knows about lazy-tlb so the calls actually improve 
the documentation), and x86 code I would argue becomes nicer (or no real
difference at worst) because you can move some exit lazy tlb handling to
that specific call rather than decipher it from switch_mm.

> I’m currently trying to document how membarrier actually works, and 
> hopefully this will result in untangling membarrier from mmdrop() and 
> such.

That would be nice.

> A silly part of this is that x86 already has a high quality 
> implementation of most of membarrier(): flush_tlb_mm().  If you flush 
> an mm’s TLB, we carefully propagate the flush to all threads, with 
> attention to memory ordering.  We can’t use this directly as an 
> arch-specific implementation of membarrier because it has the annoying 
> side affect of flushing the TLB and because upcoming hardware might be 
> able to flush without guaranteeing a core sync.  (Upcoming means Zen 
> 3, but the Zen 3 implementation is sadly not usable by Linux.)

A hardware broadcast TLB flush, you mean? What makes it unusable by 
Linux out of curiosity?

Powered by blists - more mailing lists