Date:   Wed, 24 May 2017 21:19:03 +0200
From:   Roman Penyaev <roman.penyaev@...fitbricks.com>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Mikhail Sennikovskii <mikhail.sennikovskii@...fitbricks.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Gleb Natapov <gleb@...nel.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Borislav Petkov <bp@...en8.de>,
        Radim Krčmář <rkrcmar@...hat.com>
Subject: Re: [RFC] KVM: SVM: do not drop VMCB CPL to 0 if SS is not present

On Sun, May 21, 2017 at 10:19 PM, Andy Lutomirski <luto@...nel.org> wrote:
>>
>>>
>>> Unless... is this the sysret_ss_attrs issue?
>>
>>
>> What is the issue?  This one
>>
>> https://lkml.org/lkml/2015/4/24/770
>
>
> Yes.
>
> But I was thinking about it wrong, since this is probably 64-bit userspace,

Sorry, I forgot to mention that userspace is indeed 64-bit.

> not 32-bit userspace.  Here's my theory:
>
> 1. User task A does a syscall.  It's now in kernel mode with SS != 0.
>
> 2. The scheduler runs and switches to task B.  SS != 0.
>
> 3. Kernel enters user mode for task B.
>
> 4. User task B gets interrupted.  Kernel ends up running with SS = 0.
>
> 5. Kernel switches back to task A.  SS == 0.
>
> 6. Kernel does SYSRET.  SS == __USER_DS, but SS's attributes are messed up.
>
> 7. QEMU does whatever it does that inspires it to zap SS's attributes.
>
> 8. Boom.
>
> If task B were 32-bit, then the vDSO would fix up SS, so there would only be
> a 1-instruction window for problems.
>
> To check this theory, you could try backporting this to the guest and seeing
> if the problem goes away:
>
> commit 61f01dd941ba9e06d2bf05994450ecc3d61b6b8b
> Author: Andy Lutomirski <luto@...nel.org>
> Date:   Sun Apr 26 16:47:59 2015 -0700
>
>     x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue
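
For reference, the sequence in the quoted theory can be modelled as a toy
state machine. This is only a sketch under stated assumptions, not kernel
code: the struct, the function names and the selector values are
hypothetical, and the "workaround" branch only imitates the idea of
reloading SS with a kernel data segment on context switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative selector values; treat them as assumptions of this model. */
#define KERNEL_DS 0x18
#define USER_DS   0x2b

struct cpu {
    uint16_t ss_sel;      /* visible SS selector */
    bool     ss_attrs_ok; /* are the cached SS attributes sane? */
};

/* SYSCALL loads a flat kernel SS with valid attributes. */
static void syscall_entry(struct cpu *c)
{
    c->ss_sel = KERNEL_DS;
    c->ss_attrs_ok = true;
}

/* An interrupt from 64-bit user mode leaves the kernel with a NULL SS. */
static void interrupt_from_user(struct cpu *c)
{
    c->ss_sel = 0;
    c->ss_attrs_ok = false;
}

/* Modelled AMD behaviour: SYSRET sets the selector, keeps stale attributes. */
static void sysret_to_user(struct cpu *c)
{
    c->ss_sel = USER_DS;
}

/* With the workaround, a non-kernel SS is reloaded on context switch. */
static void context_switch(struct cpu *c, bool workaround)
{
    if (workaround && c->ss_sel != KERNEL_DS) {
        c->ss_sel = KERNEL_DS;
        c->ss_attrs_ok = true;
    }
}

/* Returns true if task A ends up in user mode with sane SS attributes. */
static bool run_scenario(bool workaround)
{
    struct cpu c = { 0, false };

    syscall_entry(&c);              /* task A does a syscall          */
    context_switch(&c, workaround); /* scheduler switches to task B   */
    sysret_to_user(&c);             /* kernel enters user mode for B  */
    interrupt_from_user(&c);        /* task B gets interrupted        */
    context_switch(&c, workaround); /* kernel switches back to task A */
    sysret_to_user(&c);             /* SYSRET for task A              */

    assert(c.ss_sel == USER_DS);
    return c.ss_attrs_ok;
}
```

In this model run_scenario(false) leaves task A with SS == USER_DS but
stale attributes, while run_scenario(true) does not.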


Yes, that is exactly what is happening.  I've backported your patch to 3.16.
That explains everything: why the bug does not reproduce on >= 4.1 guest
kernels, and why we exit from VMRUN with SS.attributes == 0x400, i.e. the P
bit is not set (because "AMD CPUs have a misfeature").
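
To make the 0x400 value concrete, here is a small decoding sketch. It
assumes the 12-bit VMCB segment attribute layout that the SVM_SELECTOR_*
shifts in Linux's asm/svm.h describe (type in bits 0-3, S at 4, DPL at 5-6,
P at 7, AVL at 8, L at 9, DB at 10, G at 11). The struct and function names
are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout of the 12-bit VMCB segment attribute field. */
#define ATTR_TYPE_MASK 0xfu
#define ATTR_S_SHIFT   4
#define ATTR_DPL_SHIFT 5
#define ATTR_P_SHIFT   7
#define ATTR_AVL_SHIFT 8
#define ATTR_L_SHIFT   9
#define ATTR_DB_SHIFT  10
#define ATTR_G_SHIFT   11

struct seg_attr {
    unsigned type, s, dpl, p, avl, l, db, g;
};

/* Split a raw attribute value into its individual fields. */
static struct seg_attr decode_attr(uint16_t raw)
{
    struct seg_attr a;

    assert((raw & ~0xfffu) == 0); /* only 12 bits are defined */

    a.type = raw & ATTR_TYPE_MASK;
    a.s    = (raw >> ATTR_S_SHIFT) & 1u;
    a.dpl  = (raw >> ATTR_DPL_SHIFT) & 3u;
    a.p    = (raw >> ATTR_P_SHIFT) & 1u;
    a.avl  = (raw >> ATTR_AVL_SHIFT) & 1u;
    a.l    = (raw >> ATTR_L_SHIFT) & 1u;
    a.db   = (raw >> ATTR_DB_SHIFT) & 1u;
    a.g    = (raw >> ATTR_G_SHIFT) & 1u;
    return a;
}
```

Under that layout, decode_attr(0x400) has only DB set: P is 0 and DPL is 0,
so anything that derives CPL from SS.DPL would see 0.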


>>> Looks like the bug is in QEMU, then, right?
>>
>>
>> KVM SVM restores CPL from unusable selector, obviously this is not nice.
>
>
> I would imagine that QEMU shouldn't be feeding KVM such a selector. Also,
> there's an invariant that SS.DPL == CPL, at least most of the time, although
> this SYSRET issue may be the exception.
>
> Paolo, what's the intended behavior here?  Is the bug in KVM or in QEMU?

So, along with Andrew's workaround for the guest kernel, it seems that the
virtualization side should also be fixed to work around the AMD behaviour.
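
A rough sketch of the idea in the subject line, i.e. not taking CPL from
SS.DPL when SS is not present. This is not the actual KVM code and not a
proposed patch: the function names, the "keep the previous CPL" fallback
and the attribute bit positions (DPL in bits 5-6, P in bit 7) are
assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed positions inside the 12-bit VMCB segment attribute field. */
#define ATTR_DPL_SHIFT 5
#define ATTR_DPL_MASK  0x3u
#define ATTR_P_MASK    (1u << 7)

/* Naive variant: always take CPL from SS.DPL. */
static unsigned cpl_naive(uint16_t ss_attr)
{
    return (ss_attr >> ATTR_DPL_SHIFT) & ATTR_DPL_MASK;
}

/*
 * Guarded variant: trust SS.DPL only when SS is present,
 * otherwise keep the CPL that was in effect before.
 */
static unsigned cpl_guarded(uint16_t ss_attr, unsigned prev_cpl)
{
    assert(prev_cpl <= 3);

    if (!(ss_attr & ATTR_P_MASK))
        return prev_cpl;
    return (ss_attr >> ATTR_DPL_SHIFT) & ATTR_DPL_MASK;
}
```

With SS.attributes == 0x400 and a guest that was running at CPL 3, the
naive variant returns 0, while the guarded one returns 3.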

Guys, any thoughts?

--
Roman
