linux-kernel - Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf

Message-ID: <CA+55aFzxsKzuYKMCXQA+54GUhRzEMCtwYWdj+RTDv7LHUhVj4Q@mail.gmail.com>
Date:   Sat, 9 Sep 2017 11:47:33 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Borislav Petkov <bp@...en8.de>
Cc:     Markus Trippelsdorf <markus@...ppelsdorf.de>,
        Andy Lutomirski <luto@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Tom Lendacky <thomas.lendacky@....com>
Subject: Re: Current mainline git (24e700e291d52bd2) hangs when building e.g. perf

On Sat, Sep 9, 2017 at 11:29 AM, Borislav Petkov <bp@...en8.de> wrote:
> On Sat, Sep 09, 2017 at 11:26:27AM -0700, Linus Torvalds wrote:
>> But the fact that that fixes it for you does indicate that it's not
>> just a stale TLB entry or something, it really is some CPU using page
>> tables after they have been free'd and been re-allocated to something
>> else (and *then* they may point to garbage).
>
> Cool, I was trying to think of a good use case how we'd hit that. I
> guess you just gave one. :)

The thing is, even with the delayed TLB flushing, I don't think it
should be *so* delayed that we should be seeing a TLB fill from
garbage page tables.

But the part in Andy's patch that worries me the most is that

+               cpumask_clear_cpu(cpu, mm_cpumask(mm));

in enter_lazy_tlb(). It means that we won't be notified by people
invalidating the page tables, and while we then do re-validate the TLB
when we switch back from lazy mode, I still worry. I'm not entirely
convinced by that tlb_gen logic.

I can't actually see anything *wrong* in the tlb_gen logic, but it worries me.
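
For readers following the thread, here is a minimal user-space sketch of the idea under discussion: a CPU that clears itself from the mm's cpumask stops receiving flush notifications, and catches up by comparing generation counters when it leaves lazy mode. This is not the kernel code or Andy's patch. Every name in it (mm_model, cpu_model, the model_* functions, NR_CPUS_MODEL) is hypothetical, and a flush IPI is modeled as an immediate update of the remote CPU's counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4           /* hypothetical CPU count for the model */

struct mm_model {
    uint64_t tlb_gen;             /* bumped on every page-table invalidation */
    unsigned int cpumask;         /* bit n set: CPU n is notified of flushes */
};

struct cpu_model {
    uint64_t seen_gen;            /* generation this CPU last flushed up to */
    unsigned int flushes;         /* number of local flushes performed */
};

/* Lazy mode: stop receiving flush notifications for this mm. */
static void model_enter_lazy(struct mm_model *mm, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
    mm->cpumask &= ~(1u << cpu);
}

/* Invalidate page tables: bump the generation, flush only CPUs in the mask. */
static void model_invalidate(struct mm_model *mm, struct cpu_model *cpus)
{
    mm->tlb_gen++;
    for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
        if (mm->cpumask & (1u << cpu)) {
            cpus[cpu].seen_gen = mm->tlb_gen;
            cpus[cpu].flushes++;
        }
    }
}

/* A CPU is stale if it has not flushed up to the current generation. */
static bool model_tlb_is_stale(const struct mm_model *mm,
                               const struct cpu_model *c)
{
    return c->seen_gen != mm->tlb_gen;
}

/*
 * Leaving lazy mode: rejoin the mask, then revalidate against tlb_gen.
 * Returns true if a catch-up flush was needed.
 */
static bool model_leave_lazy(struct mm_model *mm, struct cpu_model *cpus,
                             int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
    mm->cpumask |= 1u << cpu;
    if (!model_tlb_is_stale(mm, &cpus[cpu]))
        return false;
    cpus[cpu].seen_gen = mm->tlb_gen;
    cpus[cpu].flushes++;
    return true;
}
```

In this model, a CPU that goes lazy and then misses an invalidation stays stale until model_leave_lazy() runs. That interval is the window the worry above is about: nothing in the model stops the lazy CPU from still holding translations for page tables that have since been freed.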

            Linus