linux-kernel - Re: Candidate Linux ABI for Intel AMX and hypothetical new related features

Message-ID: <87o8dmmljh.ffs@nanos.tec.linutronix.de>
Date:   Fri, 07 May 2021 20:44:02 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Dave Hansen <dave.hansen@...el.com>,
        Florian Weimer <fweimer@...hat.com>,
        Len Brown <lenb@...nel.org>
Cc:     Borislav Petkov <bp@...en8.de>, Willy Tarreau <w@....eu>,
        Andy Lutomirski <luto@...nel.org>,
        "Bae\, Chang Seok" <chang.seok.bae@...el.com>,
        X86 ML <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
        linux-abi@...r.kernel.org,
        "libc-alpha\@sourceware.org" <libc-alpha@...rceware.org>,
        Rich Felker <dalias@...c.org>, Kyle Huey <me@...ehuey.com>,
        Keno Fischer <keno@...iacomputing.com>
Subject: Re: Candidate Linux ABI for Intel AMX and hypothetical new related features

On Mon, May 03 2021 at 06:43, Dave Hansen wrote:
> On 5/2/21 10:18 PM, Florian Weimer wrote:
>>> 5. If the feature is enabled in XCR0, the user happily uses it.
>>>
>>>     For AMX, Linux implements "transparent first use"
>>>     so that it doesn't have to allocate 8KB context switch
>>>     buffers for tasks that don't actually use AMX.
>>>     It does this by arming XFD for all tasks, and taking a #NM
>>>     to allocate a context switch buffer only for those tasks
>>>     that actually execute AMX instructions.
>> What happens if the kernel cannot allocate that additional context
>> switch buffer?
>
> Well, it's vmalloc()'d and currently smaller than the kernel stack,
> which is also vmalloc()'d.  While it can theoretically fail, if it
> happens you have bigger problems on your hands.

Such a buffer allocation might also exceed a per-process or cgroup
limit. Every other accounted allocation happens in syscall context,
which can then return an error to which the application can react.

So what's the consequence when the allocation fails? Kill it right away
from #NM? Kill it on the first signal? Do nothing and see what happens?

Thanks,

        tglx
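
For readers following the thread, the "transparent first use" scheme and
the failure case asked about above can be modelled in plain userspace C.
This is only a sketch under stated assumptions, not kernel code: every
name in it (task_model, handle_first_use, AMX_BUF_SIZE, the limit field)
is hypothetical, the 8KB size is taken from the quoted text, and the
"kill right away from #NM" outcome is just one of the options raised in
the message, picked here for illustration. The thread itself does not
settle which policy applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define AMX_BUF_SIZE ((size_t)8 * 1024) /* "8KB" from the quoted text */

enum first_use_result {
    FIRST_USE_OK,     /* buffer present, instruction may proceed */
    FIRST_USE_KILLED  /* assumed policy: task is killed from the trap */
};

struct task_model {
    bool   xfd_armed;   /* true: first AMX instruction traps (#NM) */
    void  *xstate_buf;  /* large context switch buffer, allocated lazily */
    size_t charged;     /* bytes accounted against this task */
    size_t limit;       /* stand-in for a per-process/cgroup limit */
};

static void task_init(struct task_model *t, size_t limit)
{
    assert(t != NULL);
    t->xfd_armed = true;  /* armed for all tasks by default */
    t->xstate_buf = NULL;
    t->charged = 0;
    t->limit = limit;
}

/* Models the trap taken on the first AMX instruction of a task. */
static enum first_use_result handle_first_use(struct task_model *t)
{
    void *buf;

    assert(t != NULL);

    /* Already disarmed: the buffer exists, nothing to do. */
    if (!t->xfd_armed)
        return FIRST_USE_OK;

    /*
     * The allocation can fail for two reasons: the accounting limit
     * would be exceeded, or the allocator itself fails. There is no
     * syscall return path here, so no error code can reach the task.
     */
    if (t->limit - t->charged < AMX_BUF_SIZE)
        return FIRST_USE_KILLED;

    buf = calloc(1, AMX_BUF_SIZE);
    if (buf == NULL)
        return FIRST_USE_KILLED;

    t->xstate_buf = buf;
    t->charged += AMX_BUF_SIZE;
    t->xfd_armed = false;  /* later AMX instructions no longer trap */
    return FIRST_USE_OK;
}
```

With a limit of 16KB the first call allocates and disarms, and a second
call is a no-op. With a limit of 4KB the first call fails, and the model
has no way to report that to the task other than the kill result. That
is the difference from an allocation made in syscall context.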