Message-ID: <8735nx6s2z.ffs@tglx>
Date: Mon, 15 Nov 2021 22:12:52 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Andy Lutomirski <luto@...nel.org>,
Asit K Mallick <asit.k.mallick@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"Brown, Len" <len.brown@...el.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
the arch/x86 maintainers <x86@...nel.org>,
Borislav Petkov <bp@...en8.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"Bae, Chang Seok" <chang.seok.bae@...el.com>,
Arjan van de Ven <arjan@...ux.intel.com>
Cc: Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: Revisiting XFD-based AMX and heterogenous systems
Andy,
On Mon, Nov 15 2021 at 11:59, Andy Lutomirski wrote:
> So I suggest that we go back and switch to the XCR0 model. Tasks will
> start out with AMX clear in XCR0. If they want AMX, they issue a
> prctl asking for AMX, AMX gets set in XCR0, and the tasks need to be
> able to tolerate the XCR0 change.
We can do that, but that still wants XFD to avoid allocating large
buffers for all tasks in such a process which never use that feature.
Aside from that, as we all know, context switching XCR0 sucks.
> Then, if Intel ever wants to expose the full Alder Lake physical
> capabilities and support efficiency cores and AVX-512 on the same
> boot, we can have a mode in which tasks start with AVX-512 clear in
> XCR0 and can opt in with prctl. This will require HPC-like apps to be
> recompiled or run with a special wrapper but will otherwise expose the
> full HW capabilities. (Of course this assumes that Intel sets up MSRs
> or ucode or whatever to support this.)
If software needs to be recompiled or wrapped anyway then Intel can just
provide XFD support for AVX512 if it wants to expose this at runtime on
those CPUs.
As that needs to be implemented for AMX anyway the logical consequence
for user space is:
available = arch_prctl(ARCH_GET_XCOMP_SUPP); // Same as XCR0
permitted = arch_prctl(ARCH_GET_XCOMP_PERM); // XCR0 & permission bits
and work from there. If done with future XFD support for features other
than AMX in mind (even retroactively added for AVX512) then this should
be straightforward to adjust.
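To make the two-mask check above concrete, here is a minimal sketch in C. It rests on some assumptions which are not in the pseudo-code above: the numeric values of the ARCH_*_XCOMP_* codes and the AMX tile data feature number (18) are taken from the mainline uapi headers as I understand them, the real syscall hands the mask back through a pointer instead of the return value, and the helper names xcomp_query(), xfeature_usable() and xfeature_needs_request() are made up for illustration.

```c
#define _GNU_SOURCE
#include <stdint.h>
#if defined(__linux__)
#include <unistd.h>
#include <sys/syscall.h>
#endif

/* Assumed values, matching mainline asm/prctl.h; only used if the
 * installed headers do not provide them. */
#ifndef ARCH_GET_XCOMP_SUPP
#define ARCH_GET_XCOMP_SUPP 0x1021
#define ARCH_GET_XCOMP_PERM 0x1022
#define ARCH_REQ_XCOMP_PERM 0x1023
#endif

/* Assumed XSAVE feature number of AMX tile data. */
#define XFEATURE_XTILE_DATA 18

/* Hypothetical helper: fetch one feature mask from the kernel.
 * Returns 0 on success, -1 on failure or on non x86-64 Linux builds. */
int xcomp_query(int code, uint64_t *mask)
{
#if defined(__linux__) && defined(__x86_64__)
    return (int)syscall(SYS_arch_prctl, code, mask);
#else
    (void)code;
    *mask = 0;
    return -1;
#endif
}

/* Hypothetical helper: a feature is usable right now when it is both
 * supported by the kernel/CPU and permitted for this process. */
int xfeature_usable(uint64_t available, uint64_t permitted, unsigned int bit)
{
    uint64_t m = UINT64_C(1) << bit;

    return (available & permitted & m) != 0;
}

/* Hypothetical helper: the feature exists but permission has not been
 * granted yet, so the process has to ask for it first. */
int xfeature_needs_request(uint64_t available, uint64_t permitted,
                           unsigned int bit)
{
    uint64_t m = UINT64_C(1) << bit;

    return (available & m) != 0 && (permitted & m) == 0;
}
```

A caller would fill both masks via xcomp_query() with ARCH_GET_XCOMP_SUPP and ARCH_GET_XCOMP_PERM, and if xfeature_needs_request() says so for XFEATURE_XTILE_DATA, issue the ARCH_REQ_XCOMP_PERM request with the feature number as argument before touching AMX. Because the checks are plain bit tests on the two masks, adding another XFD-managed feature later only means passing a different bit number.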
For the kernel, adding XFD for AVX512, even conditionally based on a CPUID
bit, is pretty straightforward now. It needs a small change to the way
we distinguish XFD based and unconditional features, but that's
trivial effort compared to going for XCR0 switching with all its
downsides.
Thanks,
tglx