Message-ID: <b2624e84-6fab-44a3-affc-ce0847cd3da4@suse.com>
Date: Tue, 22 Apr 2025 11:57:01 +0200
From: Jürgen Groß <jgross@...e.com>
To: "Xin Li (Intel)" <xin@...or.com>, linux-kernel@...r.kernel.org,
 kvm@...r.kernel.org, linux-perf-users@...r.kernel.org,
 linux-hyperv@...r.kernel.org, virtualization@...ts.linux.dev,
 linux-pm@...r.kernel.org, linux-edac@...r.kernel.org,
 xen-devel@...ts.xenproject.org, linux-acpi@...r.kernel.org,
 linux-hwmon@...r.kernel.org, netdev@...r.kernel.org,
 platform-driver-x86@...r.kernel.org
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, x86@...nel.org, hpa@...or.com, acme@...nel.org,
 andrew.cooper3@...rix.com, peterz@...radead.org, namhyung@...nel.org,
 mark.rutland@....com, alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
 irogers@...gle.com, adrian.hunter@...el.com, kan.liang@...ux.intel.com,
 wei.liu@...nel.org, ajay.kaher@...adcom.com,
 bcm-kernel-feedback-list@...adcom.com, tony.luck@...el.com,
 pbonzini@...hat.com, vkuznets@...hat.com, seanjc@...gle.com,
 luto@...nel.org, boris.ostrovsky@...cle.com, kys@...rosoft.com,
 haiyangz@...rosoft.com, decui@...rosoft.com
Subject: Re: [RFC PATCH v2 21/34] x86/msr: Utilize the alternatives mechanism
 to write MSR

On 22.04.25 10:22, Xin Li (Intel) wrote:
> The story started from tglx's reply in [1]:
> 
>    For actual performance relevant code the current PV ops mechanics
>    are a horrorshow when the op defaults to the native instruction.
> 
>    look at wrmsrl():
> 
>    wrmsrl(msr, val
>     wrmsr(msr, (u32)val, (u32)val >> 32))
>      paravirt_write_msr(msr, low, high)
>        PVOP_VCALL3(cpu.write_msr, msr, low, high)
> 
>    Which results in
> 
> 	mov	$msr, %edi
> 	mov	$val, %rdx
> 	mov	%edx, %esi
> 	shr	$0x20, %rdx
> 	call	native_write_msr
> 
>    and native_write_msr() does at minimum:
> 
> 	mov    %edi,%ecx
> 	mov    %esi,%eax
> 	wrmsr
> 	ret
> 
>    In the worst case 'ret' is going through the return thunk. Not to
>    talk about function prologues and whatever.
> 
>    This becomes even more silly for trivial instructions like STI/CLI
>    or in the worst case paravirt_nop().

This is nonsense.

In the non-Xen case the initial indirect call is directly replaced with
STI/CLI via alternative patching, while for Xen it is replaced by a direct
call.

The paravirt_nop() case is handled in alt_replace_call(), which replaces the
indirect call with a nop if the target of the call is paravirt_nop()
(which is in fact nop_func()).
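
To make the three outcomes described above concrete, here is a minimal
user-space model of the decision taken per call site. It is only a sketch and
not the kernel's alt_replace_call(): the enum, the struct, the model_*()
functions and choose_patch() are hypothetical names invented for illustration,
and the fallback to a direct call for non-trivial native functions is an
assumption of this model.

```c
/* User-space model only; all names below are hypothetical. */
#include <assert.h>
#include <stddef.h>

enum patch_kind {
    PATCH_NOP,          /* call site becomes a nop */
    PATCH_INLINE_INSN,  /* call site becomes the native instruction, e.g. STI/CLI */
    PATCH_DIRECT_CALL   /* indirect call becomes a direct call */
};

typedef void (*pv_fn)(void);

/* Stand-ins for possible targets of the paravirt indirect call. */
static void model_nop(void) {}
static void model_native_sti(void) {}
static void model_xen_sti(void) {}

struct call_site {
    pv_fn target;           /* current target of the indirect call */
    int has_inline_native;  /* nonzero if a single native instruction can replace the call */
    int running_on_xen_pv;  /* nonzero when running as a Xen PV guest */
};

/* Decide what replaces the indirect call at one call site. */
static enum patch_kind choose_patch(const struct call_site *site)
{
    assert(site != NULL && site->target != NULL);

    /* A call to the nop function needs no call at all. */
    if (site->target == model_nop)
        return PATCH_NOP;

    /* Non-Xen with a trivial native op: put the instruction in place. */
    if (!site->running_on_xen_pv && site->has_inline_native)
        return PATCH_INLINE_INSN;

    /* Otherwise (e.g. Xen PV): patch in a direct call to the target. */
    return PATCH_DIRECT_CALL;
}
```

In this model a call site targeting model_native_sti with has_inline_native
set and running_on_xen_pv clear yields PATCH_INLINE_INSN, while the same site
on Xen PV targeting model_xen_sti yields PATCH_DIRECT_CALL.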

> 
>    The call makes only sense, when the native default is an actual
>    function, but for the trivial cases it's a blatant engineering
>    trainwreck.

The trivial cases are all handled as stated above: a direct replacement
instruction is placed at the indirect call position.

> Later a consensus was reached to utilize the alternatives mechanism to
> eliminate the indirect call overhead introduced by the pv_ops APIs:
> 
>      1) When built with !CONFIG_XEN_PV, X86_FEATURE_XENPV becomes a
>         disabled feature, preventing the Xen code from being built
>         and ensuring the native code is executed unconditionally.

This is the case today already. There is no need for any change to have
this in place.

> 
>      2) When built with CONFIG_XEN_PV:
> 
>         2.1) If not running on the Xen hypervisor (!X86_FEATURE_XENPV),
>              the kernel runtime binary is patched to unconditionally
>              jump to the native MSR write code.
> 
>         2.2) If running on the Xen hypervisor (X86_FEATURE_XENPV), the
>              kernel runtime binary is patched to unconditionally jump
>              to the Xen MSR write code.

I can't see what is different here compared to today's state.
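
For reference, the selection described in cases 1), 2.1) and 2.2) can be
written down as a small truth table. The following is a sketch in plain
user-space C, not kernel code: the two booleans merely model CONFIG_XEN_PV and
X86_FEATURE_XENPV, and the enum, struct and function names are hypothetical.

```c
/* User-space model only; all names below are hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum msr_write_path {
    MSR_WRITE_NATIVE,  /* native MSR write code */
    MSR_WRITE_XEN      /* Xen MSR write code */
};

struct build_and_boot {
    bool config_xen_pv;  /* models CONFIG_XEN_PV at build time */
    bool feature_xenpv;  /* models X86_FEATURE_XENPV at boot time */
};

static enum msr_write_path select_msr_write_path(struct build_and_boot s)
{
    /* Case 1: without CONFIG_XEN_PV the feature is disabled, native only. */
    if (!s.config_xen_pv)
        return MSR_WRITE_NATIVE;

    /* Cases 2.1 and 2.2: the feature bit decides which code is patched in. */
    return s.feature_xenpv ? MSR_WRITE_XEN : MSR_WRITE_NATIVE;
}
```

The model returns MSR_WRITE_XEN only when both flags are set; every other
combination ends up on the native path.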

> 
> The alternatives mechanism is also used to choose the new immediate
> form MSR write instruction when it's available.

Yes, this needs to be added.

> Consequently, remove the pv_ops MSR write APIs and the Xen callbacks.

I still don't see a major difference from today's solution.

Only the "paravirt" term has been eliminated.


Juergen
