Message-ID: <7a4de623-ecda-4369-a7ae-0c43ef328177@zytor.com>
Date: Wed, 23 Oct 2024 14:31:51 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Xin Li <xin3.li@...el.com>, Xin Li <xin@...or.com>,
Andrew Cooper <andrew.cooper3@...rix.com>
Cc: "x86@...nel.org" <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>
Subject: RFC, untested: handing of MSR immediates and MSRs on Xen
So the coming of WRMSRNS immediate and RDMSR immediate forms is now
official in the latest edition (Oct 2024) of the Intel ISE document, see:
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
I have been thinking about how to (a) leverage these instructions to the
best effect and (b) get rid of the code overhead associated with Xen
paravirtualization of a handful of MSRs. As it turns out, the vast
majority of MSRs under Xen are simply passed through anyway; a handful
(perf related) are handled differently, and a small number are ignored.
The immediate forms of these instructions are primarily motivated by
performance, not code size: by having the MSR number in an immediate, it
is available *much* earlier in the pipeline, which allows the hardware
much more leeway about how a particular MSR is handled.
Furthermore, we want to continue to minimize the overhead caused by the
remaining users of paravirtualization. The only PV platform left that
intercepts MSRs is Xen.
So, as per previous discussions, what we want to do is:
- Have Xen handled by the normal alternatives patching;
- Use an assembly wrapper around the Xen-specific code;
- Allow Xen to invoke the standard error handler by adding a new
exception intercept type: EX_TYPE_INDIRECT. This exception type
takes a register (i.e. _ASM_EXTABLE_TYPE_REG) and then looks up
the exception handler at the address pointed to by that register.
This lets the Xen assembly wrapper deal with errors as follows:
        /* let CF be set on error here (any flag condition works) */
        jc      .L_error
        ret
.L_error:
        pop     %rdx            /* Drop return address */
        sub     $5,%rdx         /* Rewind to the beginning of CALL instruction */
1:      ud2                     /* Any unconditionally trapping instruction */
        _ASM_EXTABLE_TYPE_REG(1b, 1b /* unused */, EX_TYPE_INDIRECT, %rdx)
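To make the intended lookup concrete, here is a rough user-space model of
how an EX_TYPE_INDIRECT entry might be resolved. It is only a sketch under
my own assumptions, not kernel code: struct ex_entry, struct regs,
search_table(), fixup_exception_model() and demo_indirect_fixup() are all
hypothetical names, the table is searched linearly, and addresses are
plain integers. The one hard fact it relies on is that a near CALL with a
32-bit displacement is 5 bytes long, which is why the wrapper subtracts 5
from the popped return address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; the real kernel extable layout differs. */
enum ex_type { EX_TYPE_DEFAULT, EX_TYPE_INDIRECT };

struct ex_entry {
    uintptr_t insn;     /* address of the instruction that may fault */
    uintptr_t fixup;    /* address to resume at after the fault */
    enum ex_type type;
    int reg;            /* register index, used by EX_TYPE_INDIRECT only */
};

struct regs {
    uintptr_t ip;       /* faulting instruction pointer */
    uintptr_t gpr[16];  /* general-purpose registers by encoding number */
};

#define REG_RDX 2       /* x86-64 encoding number of %rdx */
#define CALL_REL32_LEN 5 /* opcode E8 plus a 4-byte displacement */

/* Linear search; returns the entry covering addr, or NULL. */
static const struct ex_entry *search_table(const struct ex_entry *tbl,
                                           size_t n, uintptr_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].insn == addr)
            return &tbl[i];
    return NULL;
}

/*
 * Returns 1 and updates regs->ip if the fault was handled, 0 otherwise.
 * For EX_TYPE_INDIRECT the lookup is repeated at the address held in the
 * named register, so the fault is treated as if it had happened there.
 */
static int fixup_exception_model(const struct ex_entry *tbl, size_t n,
                                 struct regs *regs)
{
    assert(regs != NULL);

    const struct ex_entry *e = search_table(tbl, n, regs->ip);
    if (!e)
        return 0;

    if (e->type == EX_TYPE_INDIRECT) {
        e = search_table(tbl, n, regs->gpr[e->reg]);
        /* Assumption: chained indirection is rejected, not followed. */
        if (!e || e->type == EX_TYPE_INDIRECT)
            return 0;
    }

    regs->ip = e->fixup;
    return 1;
}

/* Walks through the wrapper's error path with made-up addresses. */
static uintptr_t demo_indirect_fixup(void)
{
    const uintptr_t call_site = 0x1000; /* patched CALL to the wrapper */
    const uintptr_t ud2_site  = 0x2000; /* label 1: in the wrapper */
    const uintptr_t handler   = 0x3000; /* fixup for the call site */

    const struct ex_entry tbl[] = {
        { call_site, handler,  EX_TYPE_DEFAULT,  0 },
        { ud2_site,  ud2_site, EX_TYPE_INDIRECT, REG_RDX },
    };

    struct regs regs = { .ip = ud2_site };

    /* pop %rdx; sub $5,%rdx */
    uintptr_t return_addr = call_site + CALL_REL32_LEN;
    regs.gpr[REG_RDX] = return_addr - CALL_REL32_LEN;

    if (!fixup_exception_model(tbl, sizeof(tbl) / sizeof(tbl[0]), &regs))
        return 0;
    return regs.ip;
}
```

In this model demo_indirect_fixup() ends up at the fixup address
registered for the call site, which is the point of the exercise: the
ud2 in the wrapper borrows whatever handler the original callsite
already had, so the Xen path needs no error handling of its own.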
Rather than trying to explain the whole mechanism, I'm including a
crude-and-totally-untested concept implementation for comments and
hopefully, eventually, productization.
Note: I haven't added tracepoint handling yet. *Ideally* tracepoints
would be patched over the main callsite instead of using a separate
static_key() -- which also messes up register allocation due to the
subsequent call. This is a general problem with tracepoints which
perhaps is better handled separately.
-hpa