linux-kernel - Re: [RFC] x86: Speculative execution warnings


Message-Id: <3944C0B1-D0C4-4D2F-B055-69313CFD73F2@amacapital.net>
Date:   Tue, 14 May 2019 10:15:21 -0700
From:   Andy Lutomirski <luto@...capital.net>
To:     Nadav Amit <namit@...are.com>
Cc:     Paul Turner <pjt@...gle.com>,
        the arch/x86 maintainers <x86@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Andy Lutomirsky <luto@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jann Horn <jannh@...gle.com>
Subject: Re: [RFC] x86: Speculative execution warnings



On May 14, 2019, at 10:00 AM, Nadav Amit <namit@...are.com> wrote:

>> On May 14, 2019, at 1:00 AM, Paul Turner <pjt@...gle.com> wrote:
>> 
>> From: Nadav Amit <namit@...are.com>
>> Date: Fri, May 10, 2019 at 7:45 PM
>> To: <x86@...nel.org>
>> Cc: Borislav Petkov, <linux-kernel@...r.kernel.org>, Nadav Amit, Andy
>> Lutomirsky, Ingo Molnar, Peter Zijlstra, Thomas Gleixner, Jann Horn
>> 
>>> It may be useful to check at runtime whether certain assertions are
>>> violated even during speculative execution. This makes it possible to
>>> avoid adding unnecessary memory fences while also checking that no data
>>> leak channels exist.
>>> 
>>> For example, adding such checks can show that allocating zeroed pages
>>> can return speculatively non-zeroed pages (the first qword is not
>>> zero).  [This might be a problem when the page-fault handler performs
>>> software page-walk, for example.]
>>> 
>>> Introduce SPEC_WARN_ON(), which checks at runtime whether a certain
>>> condition is violated during speculative execution. The condition should
>>> be computed without branches, e.g., using bitwise operators. The check
>>> will wait for the condition to be realized (i.e., not speculated), and
>>> if the assertion is violated, a warning will be thrown.
>>> 
>>> Warnings can be provided in one of two modes: precise and imprecise.
>>> Neither mode is perfect. The precise mode does not always make it easy
>>> to understand which assertion was broken; instead, it points to a
>>> location in the execution somewhere near where the assertion was
>>> violated.  In addition, it prints a warning for each violation (unlike
>>> WARN_ONCE()-like behavior).
>>> 
>>> The imprecise mode, on the other hand, can sometimes throw the wrong
>>> indication, specifically if the control flow has changed between the
>>> speculative execution and the actual one. Note that this is not a
>>> false positive; it just means that the output could mislead the user
>>> into thinking the wrong assertion was broken.
>>> 
>>> There are some more limitations. Since the mechanism requires an
>>> indirect branch, it should not be used in production systems that are
>>> susceptible to Spectre v2. The mechanism requires TSX and performance
>>> counters that are only available on Skylake+. There is a hidden
>>> assumption that TSX is not used in the kernel for anything else, other
>>> than this mechanism.
>> 
>> Nice trick!
> 
> “Illusion.” [ ignore if you don’t know the reference ]
> 
>> 
>> Can you eliminate the indirect call by forcing an access fault to
>> abort the transaction instead, e.g. "cmove 0, $1"?
>> 
>> (If this works, it may also allow support on older architectures as
>> the RTM_RETIRED.ABORT* events go back further I believe?)
> 
> I don’t think it would work. The whole problem is that we need a counter
> that is updated during execution and not retirement. I tried several
> counters and could not find other appropriate ones.
> 
> The idea behind the implementation is to affect the control flow through
> data dependency. I may be able to do something similar without an indirect
> branch. I’ll take a page, put the XABORT on the page and make the page NX.
> Then, a direct jump would go to this page. The conditional-mov would change
> the PTE to X if the assertion is violated. There should be a page-walk even
> if the CPU finds the entry in the TLB, since this entry is NX.
> 

I think you only get a page walk if the TLB entry is not-present.  I’d be a bit surprised if the CPU is willing to execute, even speculatively, from speculatively written data. Good luck!
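
For readers following the thread: the RFC asks that the SPEC_WARN_ON() condition be computed without branches, and the reply above talks about steering control flow through a data dependency (a conditional move) rather than a conditional branch. A minimal user-space sketch of that idea follows. It is not the RFC's implementation; the helper names (nonzero_mask, select_u64, first_qword_violation) are hypothetical, and a compiler is free to turn this C back into branches, so real kernel code would likely need inline assembly to guarantee the instruction sequence.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: all-ones if v != 0, zero otherwise.
 * (v | -v) has its top bit set exactly when v is non-zero, so no
 * comparison-and-branch is needed.
 */
static inline uint64_t nonzero_mask(uint64_t v)
{
	uint64_t top = (v | ((uint64_t)0 - v)) >> 63;	/* 1 or 0 */

	return (uint64_t)0 - top;			/* ~0 or 0 */
}

/*
 * Hypothetical helper: pick a when mask is all-ones, b when mask is
 * zero. This is the C analogue of a conditional move: the result
 * depends on the condition only through data, not through a branch.
 */
static inline uint64_t select_u64(uint64_t mask, uint64_t a, uint64_t b)
{
	return (a & mask) | (b & ~mask);
}

/*
 * Example condition from the RFC text: a "zeroed" page whose first
 * qword is not zero. Returns an all-ones mask on violation.
 */
static inline uint64_t first_qword_violation(const uint64_t *page)
{
	return nonzero_mask(page[0]);
}

/* Small self-check for the helpers above. */
static inline void spec_cond_selftest(void)
{
	uint64_t zeroed[2] = { 0, 0 };
	uint64_t dirty[2] = { 0xdeadbeef, 0 };

	assert(first_qword_violation(zeroed) == 0);
	assert(first_qword_violation(dirty) == UINT64_MAX);
}
```

In the scheme discussed in the thread, the value chosen by something like select_u64() would be the target of the indirect branch (or, in the proposed variant, the PTE bits), so that only a violated assertion leads to the XABORT.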