Message-ID: <7a4de623-ecda-4369-a7ae-0c43ef328177@zytor.com>
Date: Wed, 23 Oct 2024 14:31:51 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Xin Li <xin3.li@...el.com>, Xin Li <xin@...or.com>,
        Andrew Cooper <andrew.cooper3@...rix.com>
Cc: "x86@...nel.org" <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: RFC, untested: handing of MSR immediates and MSRs on Xen

The immediate forms of WRMSRNS and RDMSR are now official as of the 
latest edition (Oct 2024) of the Intel ISE document, see:

https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html

I have been thinking about how to (a) leverage these instructions to the 
best effect and (b) get rid of the code overhead associated with Xen 
paravirtualization of a handful of MSRs. As it turns out, the vast 
majority of MSRs under Xen are simply passed through anyway; a handful 
(perf related) are handled differently, and a small number are ignored.

The immediate forms of these instructions are primarily motivated by 
performance, not code size: by having the MSR number in an immediate, it 
is available *much* earlier in the pipeline, which allows the hardware 
much more leeway about how a particular MSR is handled.

Furthermore, we want to continue to minimize the overhead caused by the 
remaining users of paravirtualization. The only PV platform left that 
intercepts MSRs is Xen.

So, as per previous discussions, what we want to do is:

- Have Xen handled by the normal alternatives patching;
- Use an assembly wrapper around the Xen-specific code;
- Allow Xen to invoke the standard error handler by adding a new
   exception intercept type: EX_TYPE_INDIRECT. This exception type
   takes a register (i.e. _ASM_EXTABLE_TYPE_REG) and then looks up
   the exception handler at the address pointed to by that register.
   This lets the Xen assembly wrapper deal with errors by:

     /* let CF be set on error here (any flag condition works) */
     jc .L_error
     ret
   .L_error:
     pop %rdx	  /* Drop return address */
     sub $5,%rdx	  /* Rewind to the beginning of CALL instruction */
     1: ud2        /* Any unconditionally trapping instruction */
     _ASM_EXTABLE_TYPE_REG(1b, 1b /* unused */, EX_TYPE_INDIRECT, %rdx)
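
To make the indirection concrete, here is a minimal user-space model of the lookup described above. It is only a sketch and not kernel code: the struct layout, the function names, and the addresses are all made up for illustration, and the register is hard-wired to %rdx, whereas the real _ASM_EXTABLE_TYPE_REG would encode the register in the table entry. The "return address minus 5" step assumes the call site is a 5-byte CALL rel32, matching the sub $5 above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; these are NOT the kernel's definitions. */
enum ex_type_model { EX_MODEL_DEFAULT, EX_MODEL_INDIRECT };

struct regs_model {
	uint64_t ip;	/* faulting instruction address */
	uint64_t dx;	/* models %rdx */
};

struct extable_model {
	uint64_t insn;	/* address covered by this entry */
	uint64_t fixup;	/* resume address (unused for INDIRECT) */
	enum ex_type_model type;
};

static const struct extable_model *
lookup_model(const struct extable_model *tbl, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == addr)
			return &tbl[i];
	return NULL;
}

/* Returns 1 if the fault was fixed up, 0 if no handler applies. */
static int fixup_model(const struct extable_model *tbl, size_t n,
		       struct regs_model *regs)
{
	const struct extable_model *e = lookup_model(tbl, n, regs->ip);

	if (!e)
		return 0;
	if (e->type == EX_MODEL_INDIRECT) {
		/* Redo the lookup as if the fault were at the address in %rdx. */
		e = lookup_model(tbl, n, regs->dx);
		if (!e || e->type == EX_MODEL_INDIRECT)
			return 0;	/* refuse chained indirection */
	}
	regs->ip = e->fixup;
	return 1;
}

/* Walks through the wrapper's error path; returns the resume address or 0. */
static uint64_t demo_resume_ip(void)
{
	const uint64_t call_site = 0x1000;	/* patched CALL in the caller */
	const uint64_t wrapper_ud2 = 0x9000;	/* the "1: ud2" in the wrapper */
	const struct extable_model tbl[] = {
		{ call_site,   0x2000, EX_MODEL_DEFAULT  },
		{ wrapper_ud2, 0,      EX_MODEL_INDIRECT },
	};
	struct regs_model regs;

	regs.dx = (call_site + 5) - 5;	/* pop return address, sub $5 */
	regs.ip = wrapper_ud2;		/* ud2 traps here */

	return fixup_model(tbl, 2, &regs) ? regs.ip : 0;
}
```

In this model the trap taken inside the wrapper ends up resuming at the fixup registered for the original call site, which is the point of the scheme: the call site keeps its ordinary exception table entry and the wrapper needs no error handling of its own.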

Rather than trying to explain the whole mechanism, I'm including a 
crude-and-totally-untested concept implementation for comments and 
hopefully, eventually, productization.

Note: I haven't added tracepoint handling yet. *Ideally* tracepoints 
would be patched over the main callsite instead of using a separate 
static_key() -- which also messes up register allocation due to the 
subsequent call. This is a general problem with tracepoints which 
perhaps is better handled separately.

	-hpa



