Message-ID: <20230713112215.2577442-1-andrew.cooper3@citrix.com>
Date: Thu, 13 Jul 2023 12:22:15 +0100
From: Andrew Cooper <andrew.cooper3@...rix.com>
To: <kai.huang@...el.com>
CC: <bp@...en8.de>, <dave.hansen@...el.com>, <hpa@...or.com>,
<isaku.yamahata@...el.com>, <kirill.shutemov@...ux.intel.com>,
<kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<mingo@...hat.com>, <pbonzini@...hat.com>, <peterz@...radead.org>,
<sathyanarayanan.kuppuswamy@...ux.intel.com>, <seanjc@...gle.com>,
<tglx@...utronix.de>, <x86@...nel.org>
Subject: Re: [PATCH 07/10] x86/tdx: Extend TDX_MODULE_CALL to support more TDCALL/SEAMCALL leafs
On Thu, 13 Jul 2023 10:47:44 +0000, Huang, Kai wrote:
> On Thu, 2023-07-13 at 12:37 +0200, Peter Zijlstra wrote:
> > On Thu, Jul 13, 2023 at 10:19:49AM +0000, Huang, Kai wrote:
> > > On Thu, 2023-07-13 at 10:43 +0200, Peter Zijlstra wrote:
> > > > On Thu, Jul 13, 2023 at 08:02:54AM +0000, Huang, Kai wrote:
> > > >
> > > > > Sorry I am ignorant here. Won't "clearing ECX only" leave high bits of
> > > > > registers still containing guest's value?
> > > >
> > > > architecture zero-extends 32bit stores
> > >
> > > Sorry, where can I find this information? Looking at SDM I couldn't find :-(
> >
> > Yeah, I couldn't find it in a hurry either, but bpetkov pasted me this
> > from the AMD document:
> >
> > "In 64-bit mode, the following general rules apply to instructions and their operands:
> > “Promoted to 64 Bit”: If an instruction’s operand size (16-bit or 32-bit) in legacy and
> > compatibility modes depends on the CS.D bit and the operand-size override prefix, then the
> > operand-size choices in 64-bit mode are extended from 16-bit and 32-bit to include 64 bits (with a
> > REX prefix), or the operand size is fixed at 64 bits. Such instructions are said to be “Promoted to
> > 64 bits” in Table B-1. However, byte-operand opcodes of such instructions are not promoted."
> >
> > > I _think_ I understand now? In 64-bit mode
> > >
> > > xor %eax, %eax
> > >
> > > is equivalent to
> > >
> > > xor %rax, %rax
> > >
> > > (due to "architecture zero-extends 32bit stores")
> > >
> > > Thus using the former (plus using "d" for %r*) can save some memory?
> >
> > Yes, 64bit wide instructions get a REX prefix byte 0x4X (somehow I keep
> > typing RAX) in front to tell it's a 64bit wide op.
> >
> >    31 c0      xor %eax,%eax
> >    48 31 c0   xor %rax,%rax
> >
> > The REX byte will show up for rN usage, because then we need the actual
> > Register Extension part of that prefix irrespective of the width.
> >
> >    45 31 d2   xor %r10d,%r10d
> >    4d 31 d2   xor %r10,%r10
> >
> > x86 instruction encoding is 'fun' :-)
> >
> > See SDM Vol 2 2.2.1.2 if you want to know more about the REX prefix.
>
> Learned something new. Appreciate your time! :-)
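
For anyone who wants to check the byte dumps quoted above, here is a minimal
Python sketch of how the register-to-register form of XOR (opcode 0x31) is
assembled. The helper name encode_xor_self and the plain integer register
numbering are made up for illustration; this handles only `xor reg,reg` with
32-bit or 64-bit operand size and is not a general assembler.

```python
# Hardware register numbers: rax=0, rcx=1, rdx=2, rbx=3, rsp=4, rbp=5,
# rsi=6, rdi=7, and r8..r15 = 8..15.

def encode_xor_self(reg: int, wide: bool) -> bytes:
    """Encode `xor reg, reg` with a 32-bit (wide=False) or 64-bit operand size.

    Hypothetical helper for illustration only.
    """
    if not 0 <= reg <= 15:
        raise ValueError("register number must be in 0..15")

    ext = reg >> 3          # 1 for r8..r15, which need the Register Extension bits
    low = reg & 0b111       # low three bits go into the ModRM byte

    # REX layout is 0100WRXB: W selects 64-bit operand size, R extends
    # ModRM.reg, B extends ModRM.rm. X (index extension) is unused here.
    rex = 0x40 | (int(wide) << 3) | (ext << 2) | ext

    # ModRM with mod=11 (register direct), reg and rm both naming `reg`.
    modrm = 0xC0 | (low << 3) | low

    # For 32-bit and 64-bit operands a REX with no bits set carries no
    # information, so it is omitted.
    prefix = bytes([rex]) if rex != 0x40 else b""
    return prefix + bytes([0x31, modrm])


for name, reg, wide in [("eax", 0, False), ("rax", 0, True),
                        ("r10d", 10, False), ("r10", 10, True)]:
    print(f"{encode_xor_self(reg, wide).hex(' '):<10} xor %{name},%{name}")
```

The 32-bit form is never longer than the 64-bit form: it saves a byte for the
legacy registers and costs the same three bytes for %r8-15.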
And now for the extra fun...
The Silvermont uarch is 64bit, but only recognises 32bit XORs as zeroing
idioms.
So for best performance on as many uarches as possible, you should *always*
use the 32bit forms, even for %r8-15.
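
To tie this back to the original question about leftover guest bits, below is
a rough Python model of the register write rules in 64-bit mode, as I read
Intel SDM Vol 1, 3.4.1.1: 32-bit results are zero-extended to 64 bits, while
8-bit and 16-bit results leave the upper bits untouched. write_gpr and the
sample value are hypothetical names and data for illustration, and the 8-bit
case models only the low-byte registers (%al, %cl, ...), not %ah and friends.

```python
MASK64 = (1 << 64) - 1


def write_gpr(old: int, value: int, width: int) -> int:
    """Return the full 64-bit register after writing `value` at `width` bits.

    Hypothetical model, not real hardware or kernel code.
    """
    if width == 64:
        return value & MASK64
    if width == 32:
        # A 32-bit write zero-extends: the upper 32 bits are cleared.
        return value & 0xFFFF_FFFF
    if width in (8, 16):
        # Narrow writes merge into the old value and preserve the upper bits.
        low = (1 << width) - 1
        return (old & ~low & MASK64) | (value & low)
    raise ValueError("width must be 8, 16, 32 or 64")


guest_rcx = 0xDEADBEEF_CAFEF00D  # made-up stale guest value

# xor %ecx,%ecx writes a 32-bit zero, so nothing of the guest value survives.
after_xor32 = write_gpr(guest_rcx, 0, 32)

# xor %cx,%cx writes a 16-bit zero and would leave the upper 48 bits behind.
after_xor16 = write_gpr(guest_rcx, 0, 16)

print(hex(after_xor32), hex(after_xor16))
```

So only the 16-bit and 8-bit forms would leave guest data behind; the 32-bit
form clears the whole register just as the 64-bit one does.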
~Andrew