Date:   Tue, 7 Jun 2022 15:04:27 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Peter Xu <peterx@...hat.com>, Sasha Levin <sashal@...nel.org>,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org,
        Leonardo Bras <leobras@...hat.com>, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        x86@...nel.org, chang.seok.bae@...el.com, luto@...nel.org,
        kvm@...r.kernel.org
Subject: Re: [PATCH AUTOSEL 5.16 07/28] x86/kvm/fpu: Limit guest
 user_xfeatures to supported bits of XCR0

On Tue, Jun 07, 2022, Paolo Bonzini wrote:
> On 6/6/22 23:27, Peter Xu wrote:
> > On Mon, Jun 06, 2022 at 06:18:12PM +0200, Paolo Bonzini wrote:
> > > > However there seems to be something missing at least to me, on why it'll
> > > > fail a migration from 5.15 (without this patch) to 5.18 (with this patch).
> > > > In my test case, user_xfeatures will be 0x7 (FP|SSE|YMM) if without this
> > > > patch, but 0x0 if with it.
> > > 
> > > What CPU model are you using for the VM?
> > 
> > I didn't specify it, assuming it's qemu64 with no extra parameters.
> 
> Ok, so indeed it lacks AVX and this patch can have an effect.
> 
> > > For example, if the source lacks this patch but the destination has it,
> > > the source will transmit YMM registers, but the destination will fail to
> > > set them if they are not available for the selected CPU model.
> > > 
> > > See the commit message: "As a bonus, it will also fail if userspace tries to
> > > set fpu features (with the KVM_SET_XSAVE ioctl) that are not compatible to
> > > the guest configuration.  Such features will never be returned by
> > > KVM_GET_XSAVE or KVM_GET_XSAVE2."
> > 
> > IIUC you meant we should have failed KVM_SET_XSAVE when they're not aligned
> > (probably by failing validate_user_xstate_header when checking against the
> > user_xfeatures on dest host). But that's probably not my case, because here
> > KVM_SET_XSAVE succeeded, it's just that the guest gets a double fault after
> > the precopy migration completes (or for postcopy when the switchover is
> > done).
> 
> Difficult to say what's happening without seeing at least the guest code
> around the double fault (above you said "fail a migration" and I thought
> that was a different scenario than the double fault), and possibly which was
> the first exception that contributed to the double fault.

Regardless of why the guest explodes in the way it does, is someone planning on
bisecting this (if necessary) and sending a backport to v5.15?  There's another
bug report that is more than likely hitting the same bug.

https://lore.kernel.org/all/48353e0d-e771-8a97-21d4-c65ff3bc4192@sentex.net
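
For readers following the masks quoted above, here is a minimal sketch of how
a value such as 0x7 decodes into FP|SSE|YMM, and of the kind of check the
quoted commit message describes for KVM_SET_XSAVE. The bit positions are the
architectural ones for the first three XSAVE components. The function names
and the simplified header check are hypothetical, for illustration only; this
is not the kernel's validate_user_xstate_header.

```python
# Architectural bit positions of the first three XSAVE state components.
XFEATURE_FP = 1 << 0   # x87 floating point state
XFEATURE_SSE = 1 << 1  # SSE (XMM) state
XFEATURE_YMM = 1 << 2  # AVX state (upper halves of the YMM registers)

NAMES = {XFEATURE_FP: "FP", XFEATURE_SSE: "SSE", XFEATURE_YMM: "YMM"}


def decode_xfeatures(mask):
    """Return the names of the known feature bits set in `mask`."""
    return [name for bit, name in NAMES.items() if mask & bit]


def header_is_acceptable(xstate_bv, permitted):
    """Hypothetical model of the check: accept the buffer only if it
    carries no feature bit outside the `permitted` mask."""
    return (xstate_bv & ~permitted) == 0


if __name__ == "__main__":
    print(decode_xfeatures(0x7))
    # A buffer carrying YMM state, checked against a mask without YMM.
    print(header_is_acceptable(0x7, XFEATURE_FP | XFEATURE_SSE))
```

In this model, a source that saved state with mask 0x7 produces a buffer that
a destination permitting only FP|SSE would reject. Whether that matches the
failure reported in this thread is exactly what is still unclear above, since
KVM_SET_XSAVE reportedly succeeded there.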
