lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cda57e5f-acf5-414c-8faa-d2496c02ced9@citrix.com>
Date: Fri, 5 Jul 2024 11:30:16 +0100
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: Borislav Petkov <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>
Cc: dave.hansen@...el.com, xin@...or.com, linux-kernel@...r.kernel.org,
 tglx@...utronix.de, mingo@...hat.com, dave.hansen@...ux.intel.com,
 x86@...nel.org, peterz@...radead.org, nik.borisov@...e.com,
 houwenlong.hwl@...group.com, Juergen Gross <jgross@...e.com>
Subject: Re: [PATCH v1 2/4] x86/fred: Write to FRED MSRs with wrmsrns()

On 05/07/2024 10:44 am, Borislav Petkov wrote:
> On Thu, Jul 04, 2024 at 07:45:00PM -0700, H. Peter Anvin wrote:
>> Except that that would be more cleanly spelled wrmsrns()... let's just
>> abstract the whole thing away and let the user use wrmsrns() whenever
>> serialization is not necessary.
> Just don't make it more complex and unreadable than it has to be.
>
> cpu_feature_enabled() already is patching things for optimal perf so even if
> PeterZ prefers the alternative, I say it is ugly and, more importantly,
> unnecessary.

You cite perf.  Look at the disassembly of the two approaches...

cpu_feature_enabled() might give you warm fuzzy feelings that you've
eekd out every ounce of performance, but it's an absolute disaster at a
code generation level by forcing the compiler to lay out both side and
preventing any kind of CSE.  As I've reported before, count the number
of RDPKRU instructions in trivial-looking xsave handling functions for a
glimpse of the practical consequences.


Anyway, none of this is the complicated aspect.  The complicated issue
is the paravirt wrmsr().

TGLX's complaint is that everyone turns on CONFIG_PARAVIRT, and the
paravirt hook for wmsr() is a code generation disaster WRT parameter
handling.  I agree that it's not great, although it's got nothing on the
damage done by cpu_feature_enabled().


But, seeing as I've got everyone's attention, I'll repeat my proposal
for fixing this nicely, in the hope of any feedback on the 3rd posting...

The underlying problem is that parameter setup for the paravirt wrmsr()
follows a C calling convention, so the index/data are manifested into
%rdi/%rsi.  Then, the out-line "native" hook shuffles the index/data
back into %ecx/%edx/%eax, and this cost is borne in all kernels.

Instead, the better way would be to have a hook with a non-standard
calling convention which happens to match the WRMSR instruction.

That way, the native, and simple paravirt paths inline to a single
instruction with no extraneous parameter shuffling, and the shuffling
cost is borne by PARAVIRT_XXL only, where a reg/reg move is nothing
compared to the hypercall involved.

The only complication is the extable #GP hook, but that's fine to place
at the paravirt site as long as the extable handler confirms the #GP
came from a WRMSR{NS,} and not a branch.

~Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ