linux-kernel - Re: [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <53582853.30009@gmail.com>
Date:	Wed, 23 Apr 2014 23:53:39 +0300
From:	Nadav Amit <nadav.amit@...il.com>
To:	Marcelo Tosatti <mtosatti@...hat.com>
CC:	Gleb Natapov <gleb@...nel.org>,
	Nadav Amit <namit@...technion.ac.il>, pbonzini@...hat.com,
	tglx@...utronix.de, mingo@...hat.com, hpa@...or.com,
	x86@...nel.org, kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected
 by string ops

On 4/23/14, 11:11 PM, Marcelo Tosatti wrote:
> On Wed, Apr 23, 2014 at 04:58:32PM -0300, Marcelo Tosatti wrote:
>> On Tue, Apr 22, 2014 at 09:04:45AM +0300, Nadav Amit wrote:
>>> Gleb,
>>>
>>> On 4/20/14, 12:26 PM, Gleb Natapov wrote:
>>>> On Fri, Apr 18, 2014 at 07:11:33AM +0300, Nadav Amit wrote:
>>>>> When using address-size override prefix with string instructions in long-mode,
>>>>> ESI/EDI/ECX are zero extended if they are affected by the instruction
>>>>> (incremented/decremented).  Currently, the KVM emulator does not do so.
>>>>>
>>>>> In addition, although it is not well-documented, when address override prefix
>>>>> is used with REP-string instruction, RCX high half is zeroed even if ECX was
>>>>> zero on the first iteration. Therefore, the emulator should clear the upper
>>>>> part of RCX in this case, as x86 CPUs do.
>>>>>
>>>>> Signed-off-by: Nadav Amit <namit@...technion.ac.il>
>>>>> ---
>>>>> :100644 100644 69e2636... a69ed67... M	arch/x86/kvm/emulate.c
>>>>>   arch/x86/kvm/emulate.c |    4 ++++
>>>>>   1 file changed, 4 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>>>>> index 69e2636..a69ed67 100644
>>>>> --- a/arch/x86/kvm/emulate.c
>>>>> +++ b/arch/x86/kvm/emulate.c
>>>>> @@ -491,6 +491,8 @@ register_address_increment(struct x86_emulate_ctxt *ctxt, unsigned long *reg, in
>>>>>   	else
>>>>>   		mask = ad_mask(ctxt);
>>>>>   	masked_increment(reg, mask, inc);
>>>>> +	if (ctxt->ad_bytes == 4)
>>>>> +		*reg &= 0xffffffff;
>>>> *reg=(u32)*reg; and you can do it inside else part.
>>>>
>>>> register_address_increment() is used also by jmp_rel and loop instructions,
>>>> is this correct for both of those too? Probably yes.
>>>>
>>> It appears to be so.
>>> Results of 32-bit operations are implicitly zero extended to 64-bit
>>> values, and this appears to apply to all 32 bit operations,
>>> including implicit ones. Therefore it seems to apply to all these
>>> operations.
>>>
>>>>>   }
>>>>>
>>>>>   static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
>>>>> @@ -4567,6 +4569,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
>>>>>   	if (ctxt->rep_prefix && (ctxt->d & String)) {
>>>>>   		/* All REP prefixes have the same first termination condition */
>>>>>   		if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
>>>>> +			if (ctxt->ad_bytes == 4)
>>>>> +				*reg_write(ctxt, VCPU_REGS_RCX) = 0;
>>>> Does zero extension happens even if ECX was zero at the beginning on an instruction or only during
>>>> ECX modification. If later it is already covered in register_address_increment, no?
>>> The observed behaviour of the Sandy-Bridge I use, is that even if
>>> ECX is zero on the first iteration, the high half of RCX is zeroed.
>>> Therefore, this is a different case, which was not covered in
>>> register_address_increment. I agree it is totally undocumented.
>>> Following your previous comment - I may have missed the case in
>>> which loop instruction is executed with ECX = 0 while RCX != 0 and
>>> the address size is 32 bit. I will test this case soon (yet, it is
>>> lower on my priority list).
>>
>> In 64-bit mode, the operand size for all near branches (CALL, RET, JCC,
>> JCXZ, JMP, and LOOP) is forced to 64 bits.
>>
>> These instructions update the 64-bit RIP without the need for a REX
>> operand-size prefix.
>>
>> The following aspects of near branches are controlled by the effective
>> operand size:
>> • Truncation of the size of the instruction pointer
>> ...
>>
>> In 64-bit mode, all of the above actions are forced to 64 bits
>> regardless of operand size prefixes (operand size
>> prefixes are silently ignored). However, the displacement field for
>> relative branches is still limited to 32 bits and the
>> address size for near branches is not forced in 64-bit mode.
>> Address sizes affect the size of RCX used for JCXZ and LOOP; they also
>> impact the address calculation for memory
>> indirect branches. Such addresses are 64 bits by default; but they can
>> be overridden to 32 bits by an address size
>> prefix.
>>
>> So it seems your patch incorrectly handles "rex call" for example.
>
> Err, operand size is forced to 64-bits, not address size.
>
> "The following aspects of near branches are controlled by the effective
> operand size:
>   • Truncation of the size of the instruction pointer"
>
> Still, "67h call" should not truncate EIP (which your patch does).
>
Yes, I missed it.
But if I am not mistaken again, it means that the existing 
implementation of jmp_rel is broken as well when address-size override 
prefix is used. In this case, as I see it, the existing masking would 
cause the carry from the add operation to the lower half of the rip not 
to be added to the rip higher half.

I guess another patch is needed for that as well.

Nadav
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/