lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 4 Jun 2024 14:37:32 +0300
From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
To: Dave Hansen <dave.hansen@...el.com>, 
	Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, x86@...nel.org, 
	"H. Peter Anvin" <hpa@...or.com>, "K. Y. Srinivasan" <kys@...rosoft.com>, 
	Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>, 
	Josh Poimboeuf <jpoimboe@...nel.org>, Peter Zijlstra <peterz@...radead.org>, 
	linux-coco@...ts.linux.dev, linux-kernel@...r.kernel.org, linux-hyperv@...r.kernel.org
Subject: Re: [PATCH] x86/tdx: Enhance code generation for TDCALL and SEAMCALL
 wrappers

On Mon, Jun 03, 2024 at 06:37:45AM -0700, Dave Hansen wrote:
> On 6/2/24 04:54, Kirill A. Shutemov wrote:
> > Sean observed that the compiler is generating inefficient code to clear
> > the tdx_module_args struct for TDCALL and SEAMCALL wrappers. The
> > compiler is generating numerous instructions at each call site to clear
> > the unused fields of the structure.
> > 
> > To address this issue, avoid using C99-initializer and instead
> > explicitly use string instructions to clear the struct.
> > 
> > With Clang, this change results in a savings of approximately 3K with my
> > configuration:
> > 
> > add/remove: 0/0 grow/shrink: 0/21 up/down: 0/-3187 (-3187)
> > 
> > With GCC, the savings are less significant at around 300 bytes:
> > 
> > add/remove: 0/0 grow/shrink: 3/22 up/down: 17/-313 (-296)
> > 
> > GCC tends to generate string instructions more frequently to clear the
> > struct.
> 
> <shrug>
> 
> I don't think moving away from perfectly normal C struct initialization
> is worth it for 300 bytes of text in couple of slow paths.
> 
> If someone out there is using clang, is confident that it is doing the
> right thing and not just being silly, _and_ is horribly bothered by its
> code generation, then please speak up.

Conceptually, I like my previous attempt more. But it is much more
intrusive and I am not sure it is worth the risk.

This patch feels like hack around compiler.

Sean, do you have any comments?

> > +static __always_inline void tdx_arg_init(struct tdx_module_args *args)
> > +{
> > +	asm ("rep stosb"
> > +	     : "+D" (args)
> > +	     : "c" (sizeof(*args)), "a" (0)
> > +	     : "memory");
> > +}
> 
> The inline assembly also has the side-effect of tripping up the
> compiler.  The compiler can't optimize across these at all and it
> probably has the effect of bloating the code.

It can, but it is limited. Compiler has to flush registers content back to
memory before asm() and cannot assume anything that read from memory
before the asm() is still valid after.
 
> Oh, and if we're going to leave this weirdo initialization idiom for
> TDX, it needs to be well commented:
> 
> /*
>  * Using normal " = {};" to initialize tdx_module_args results in
>  * bloated hard-to-read assembly.  Zero it using the most compact way
>  * available.
>  */
> 
> Eh?

Okay.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ