linux-kernel - Re: [PATCH 2/4] mm/tlb: Remove tlb_remove

Message-ID: <20180824084259.GJ24124@hirez.programming.kicks-ass.net>
Date:   Fri, 24 Aug 2018 10:42:59 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Nick Piggin <npiggin@...il.com>,
        Andrew Lutomirski <luto@...nel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Will Deacon <will.deacon@....com>,
        Rik van Riel <riel@...riel.com>,
        Jann Horn <jannh@...gle.com>,
        Adin Scannell <ascannell@...gle.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        David Miller <davem@...emloft.net>,
        Martin Schwidefsky <schwidefsky@...ibm.com>,
        Michael Ellerman <mpe@...erman.id.au>
Subject: Re: [PATCH 2/4] mm/tlb: Remove tlb_remove_table() non-concurrent
 condition

On Wed, Aug 22, 2018 at 09:54:48PM -0700, Linus Torvalds wrote:

> It honored it for the *normal* case, which is why it took so long to
> notice that the TLB shootdown had been broken on x86 when it moved to
> the "generic" code. The *normal* case does this all right, and batches
> things up, and then when the batch fills up it does a
> tlb_table_flush() which does the TLB flush and schedules the actual
> freeing.
> 
> But there were two cases that *didn't* do that. The special "I'm the
> only thread" fast case, and the "oops I ran out of memory, so now I'll
> fake it, and just synchronize with page walkers manually, and then do
> that special direct remove without flushing the tlb".

The actual RCU batching case was also busted; there was no guarantee
that by the time we run the RCU callbacks the invalidate would've
happened. Exceedingly unlikely, but no guarantee.

So really, all 3 cases in tlb_remove_table() were busted in this
respect.
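
To make the ordering problem concrete, here is a minimal user-space sketch in C.
It is a toy model, not kernel code: every name in it (struct model,
model_remove_table(), count_unsafe(), the PATH_* constants) is hypothetical and
exists only for illustration. It assumes the worst-case ordering for the RCU
batch path, i.e. that the callback runs before any invalidate, which is the
"no guarantee" situation described above. The model counts how many of the
three paths free a page table before a TLB invalidate has been issued.

```c
/* Toy user-space model of the three tlb_remove_table() paths.
 * NOT kernel code; all names below are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum path {
    PATH_SINGLE_USER,  /* "I'm the only thread" fast case */
    PATH_NO_MEMORY,    /* batch allocation failed, direct remove */
    PATH_RCU_BATCH     /* normal batched case, freed from an RCU callback */
};

struct model {
    bool tlb_invalidated;  /* has a TLB invalidate been issued yet? */
    int  frees;            /* page tables freed */
    int  unsafe_frees;     /* page tables freed before the invalidate */
};

static void model_invalidate(struct model *m)
{
    assert(m != NULL);
    m->tlb_invalidated = true;
}

static void model_free_table(struct model *m)
{
    assert(m != NULL);
    m->frees++;
    if (!m->tlb_invalidated)
        m->unsafe_frees++;  /* stale TLB entries may still reference it */
}

/* With flush_first == false, no path issues an invalidate before the
 * free (worst case for the RCU path: the callback simply runs first).
 * With flush_first == true, the invalidate is ordered before every free. */
static void model_remove_table(struct model *m, enum path p, bool flush_first)
{
    if (flush_first)
        model_invalidate(m);

    switch (p) {
    case PATH_SINGLE_USER:  /* direct free, no flush of its own */
    case PATH_NO_MEMORY:    /* syncs with walkers, then direct free */
    case PATH_RCU_BATCH:    /* callback not ordered after an invalidate */
        model_free_table(m);
        break;
    }
}

/* Run each path once on a fresh model and count the unsafe frees. */
static int count_unsafe(bool flush_first)
{
    int unsafe = 0;

    for (int p = PATH_SINGLE_USER; p <= PATH_RCU_BATCH; p++) {
        struct model m = { false, 0, 0 };
        model_remove_table(&m, (enum path)p, flush_first);
        unsafe += m.unsafe_frees;
    }
    return unsafe;
}
```

In this model count_unsafe(false) returns 3, one unsafe free per path, and
count_unsafe(true) returns 0. The sketch only illustrates the ordering
requirement; it says nothing about how the real code issues the invalidate.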