linux-kernel - Re: [PATCH v2] x86/nmi: Fix some races in NMI uaccess

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALCETrXjH175OLPRCwA8=Stbq1KmJwwvkZ36Y74=hF_i8GFWCw@mail.gmail.com>
Date:   Wed, 29 Aug 2018 08:36:01 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Rik van Riel <riel@...riel.com>
Cc:     Andy Lutomirski <luto@...nel.org>, X86 ML <x86@...nel.org>,
        Borislav Petkov <bp@...en8.de>, Jann Horn <jannh@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        stable <stable@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Nadav Amit <nadav.amit@...il.com>
Subject: Re: [PATCH v2] x86/nmi: Fix some races in NMI uaccess

On Wed, Aug 29, 2018 at 8:17 AM, Rik van Riel <riel@...riel.com> wrote:
> On Tue, 2018-08-28 at 20:46 -0700, Andy Lutomirski wrote:
>> On Tue, Aug 28, 2018 at 10:56 AM, Rik van Riel <riel@...riel.com>
>> wrote:
>> > On Mon, 27 Aug 2018 16:04:16 -0700
>> > Andy Lutomirski <luto@...nel.org> wrote:
>> >
>> > > The 0day bot is still chewing on this, but I've tested it a bit
>> > > locally
>> > > and it seems to do the right thing.
>> >
>> > Hi Andy,
>> >
>> > the version of the patch below should fix the bug we talked about
>> > in email yesterday.  It should automatically cover kernel threads
>> > in lazy TLB mode, because current->mm will be NULL, while the
>> > cpu_tlbstate.loaded_mm should never be NULL.
>> >
>>
>> That's better than mine.  I tweaked it a bit and added some
>> debugging,
>> and I got this:
>>
>>
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/fixes&id=dd956eba16646fd0b15c3c0741269dfd84452dac
>>
>> I made the loaded_mm handling a little more conservative to make it
>> more obvious that switch_mm_irqs_off() is safe regardless of exactly
>> when it gets called relative to switching current.
>
> I am not convinced that the dance of writing
> cpu_tlbstate.loaded_mm twice, with a barrier on
> each end, is useful or necessary.
>
> At the time switch_mm_irqs_off returns, nmi_uaccess_ok()
> will still return false, because we have not switched
> "current" to the task that owns the next mm_struct yet.
>
> We just have to make sure to:
> 1) Change cpu_tlbstate.loaded_mm before we manipulate
>    CR3, and
> 2) Change "current" only once enough of the mm stuff has
>    been switched, __switch_to seems to get that right.
>
> Between the time switch_mm_irqs_off() sets cpu_tlbstate
> to the next mm, and __switch_to moves() over current,
> nmi_uaccess_ok() will return false.

All true, but I think it stops working as soon as someone starts
calling switch_mm_irqs_off() for some other reason, such as during
text_poke().  And that was the original motivation for this patch.