linux-kernel - Re: [PATCH v9 12/26] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAJvTdKn09GiAOgdsOR-+ooEO=bmj8VDL9e9sSAsu2UPx73a-Mw@mail.gmail.com>
Date:   Tue, 24 Aug 2021 19:17:49 -0400
From:   Len Brown <lenb@...nel.org>
To:     Borislav Petkov <bp@...en8.de>
Cc:     "Chang S. Bae" <chang.seok.bae@...el.com>,
        Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>, X86 ML <x86@...nel.org>,
        "Brown, Len" <len.brown@...el.com>,
        Dave Hansen <dave.hansen@...el.com>, thiago.macieira@...el.com,
        "Liu, Jing2" <jing2.liu@...el.com>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v9 12/26] x86/fpu/xstate: Use feature disable (XFD) to
 protect dynamic user state

On Wed, Aug 18, 2021 at 12:24 PM Borislav Petkov <bp@...en8.de> wrote:

> Why isn't the buffer pre-allocated for the process after latter having
> done prctl() so that when an #NM happens, no allocation happens at all?

The problem with a system call to pre-allocate an AMX context switch buffer
is that it doesn't actually deliver on the goal of guaranteeing no subsequent
run-time failures due to OOM.

Even if your AMX thread pool threads were to invoke this system call
as soon as possible...
What is to say that the thread pool is created only at a time when memory
is available?  A thread could be created 24 hours into program execution
under OOM conditions and this system call will return ENOMEM, and your program
will in all likelihood throw up its arms and exit at the exact same place
it would exit for transparently allocated buffers.

What if you don't care about 24-hours in, and you do care about
allocating at program start?
The program can equally cause the kernel to allocate an AMX context switch
buffer by simply touching a TMM register -- and this can be done at exactly the
same place in the program that calling a pre-allocate system call.

The difference in these two methods is that the system call returns a synchronus
ENOMEM, while the touching of a TMM register sends you a signal at
that location.
In theory, a program may have a thoughtfully implemented and thoroughly tested
*else* clause for ENOMEM -- but you and I know that is a fantasy --
they will exit anyway.

The advantage of the #NM over the syscall is that  the programmer doesn't
actually have to do anything.  Also, transparently allocated buffers offer
a theoretical benefit that a program may have many threads, but only a few
may actually touch AMX, and so there is savings to be had by allocating buffers
only for the threads that actually use the buffers.

Finally, the XFD/NM mechanism opens the door in the future for the
kernel to actually
harvest allocated, but unused buffers -- though we didn't bother
implementing that for AMX.

> And with those buffers preallocated, all that XFD muck is not needed either.

Independent of context switch buffer allocation...

XFD is used to *enforce* that AMX is not used without permission.
Were we to not use the XFD feature, users would be able to stash
data in TMM registers and even use TMUL without the kernel
being able to prevent them from doing so.  Then when they
context switch or take a signal, the data in their TMM registers
would mysteriously vanish...

Much better to be able to tell them immediately that they are doing it wrong...

-- 
Len Brown, Intel Open Source Technology Center