linux-kernel - RFC, untested: handling of MSR immediates and MSRs on Xen

Message-ID: <7a4de623-ecda-4369-a7ae-0c43ef328177@zytor.com>
Date: Wed, 23 Oct 2024 14:31:51 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Xin Li <xin3.li@...el.com>, Xin Li <xin@...or.com>,
        Andrew Cooper <andrew.cooper3@...rix.com>
Cc: "x86@...nel.org" <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: RFC, untested: handling of MSR immediates and MSRs on Xen

So the coming of WRMSRNS immediate and RDMSR immediate forms is now 
official in the latest edition (Oct 2024) of the Intel ISE document, see:

https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html

I have been thinking about how to (a) leverage these instructions to the 
best effect and (b) get rid of the code overhead associated with Xen 
paravirtualization of a handful of MSRs. As it turns out, the vast 
majority of MSRs under Xen are simply passed through anyway; a handful 
(perf related) are handled differently, and a small number are ignored.

The immediate forms of these instructions are primarily motivated by 
performance, not code size: by having the MSR number in an immediate, it 
is available *much* earlier in the pipeline, which allows the hardware 
much more leeway about how a particular MSR is handled.

Furthermore, we want to continue to minimize the overhead caused by the 
remaining users of paravirtualization. The only PV platform left that 
intercepts MSRs is Xen.

So, as per previous discussions, what we want to do is:

- Have Xen handled by the normal alternatives patching;
- Use an assembly wrapper around the Xen-specific code;
- Allow Xen to invoke the standard error handler by adding a new
   exception intercept type: EX_TYPE_INDIRECT. This exception type
   takes a register (i.e. _ASM_EXTABLE_TYPE_REG) and then looks up
   the exception handler at the address pointed to by that register.
   This lets the Xen assembly wrapper deal with errors by:

     /* let CF be set on error here (any flag condition works) */
     jc .L_error
     ret
   .L_error:
     pop %rdx      /* Drop return address */
     sub $5,%rdx   /* Rewind to the beginning of CALL instruction */
     1: ud2        /* Any unconditionally trapping instruction */
     _ASM_EXTABLE_TYPE_REG(1b, 1b /* unused */, EX_TYPE_INDIRECT, %rdx)
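
To make the indirect lookup concrete, here is a rough userspace model of
what an EX_TYPE_INDIRECT handler could do. It is only a sketch under
stated assumptions, not kernel code: struct ex_entry, struct regs,
search_table(), the fixup_exception() signature, and the "no chained
indirection" rule are all hypothetical simplifications invented for
illustration. Only the idea (take the address from a register, then look
up the handler registered for that address) comes from the description
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel's extable types. */
enum ex_type { EX_TYPE_DEFAULT, EX_TYPE_WRMSR, EX_TYPE_INDIRECT };

#define REG_RDX 2	/* x86 encoding number of %rdx */
#define NR_REGS 16

struct ex_entry {
	uintptr_t insn;		/* address of the instruction that may fault */
	uintptr_t fixup;	/* address to resume at after handling */
	enum ex_type type;	/* which handler to run */
	int reg;		/* EX_TYPE_INDIRECT only: register with the real address */
};

struct regs {
	uintptr_t ip;
	uintptr_t gpr[NR_REGS];
};

/* Linear search; a real table would be sorted and binary-searched. */
static const struct ex_entry *search_table(const struct ex_entry *tbl,
					   size_t n, uintptr_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == addr)
			return &tbl[i];
	return NULL;
}

/*
 * Returns the type of the handler finally applied, or -1 if no handler
 * matched. For EX_TYPE_INDIRECT, the lookup is redone as if the fault
 * had happened at the address held in the named register.
 */
static int fixup_exception(const struct ex_entry *tbl, size_t n,
			   struct regs *regs)
{
	const struct ex_entry *e = search_table(tbl, n, regs->ip);

	if (!e)
		return -1;

	if (e->type == EX_TYPE_INDIRECT) {
		assert(e->reg >= 0 && e->reg < NR_REGS);
		e = search_table(tbl, n, regs->gpr[e->reg]);
		/* Assumption: indirect entries may not chain. */
		if (!e || e->type == EX_TYPE_INDIRECT)
			return -1;
	}

	regs->ip = e->fixup;
	return e->type;
}
```

In this model, the call site would carry an ordinary entry (say,
EX_TYPE_WRMSR at the address of the CALL), and the ud2 in the wrapper
would carry an EX_TYPE_INDIRECT entry naming %rdx. When the ud2 traps
with %rdx pointing at the CALL, the handler registered for the call site
runs and execution resumes at that entry's fixup address.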

Rather than trying to explain the whole mechanism, I'm including a 
crude-and-totally-untested concept implementation for comments and 
hopefully, eventually, productization.

Note: I haven't added tracepoint handling yet. *Ideally* tracepoints 
would be patched over the main callsite instead of using a separate 
static_key() -- which also messes up register allocation due to the 
subsequent call. This is a general problem with tracepoints which 
perhaps is better handled separately.

	-hpa
