Message-ID: <7de61c43-e88d-9c22-7e5f-2229d8c49523@xen.org>
Date:   Tue, 19 Sep 2023 16:47:51 +0100
From:   Paul Durrant <xadimgnik@...il.com>
To:     David Woodhouse <dwmw2@...radead.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Cc:     Paul Durrant <pdurrant@...zon.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org
Subject: Re: [PATCH v4 09/13] KVM: xen: automatically use the vcpu_info
 embedded in shared_info

On 19/09/2023 16:38, David Woodhouse wrote:
> On Tue, 2023-09-19 at 15:34 +0100, Paul Durrant wrote:
>>>> +       ret = kvm_gpc_activate(vi_gpc, gpa, sizeof(struct vcpu_info));
>>>
>>>   From this moment, can't interrupts be delivered to the new vcpu_info,
>>> even though the memcpy hasn't happened yet?
>>>
>>
>> Hmm, that's a good point. TBH it would be nice to have an 'activate and
>> leave locked' primitive to avoid this.
> 
> I suppose so from the caller's point of view in this case, but I'm
> somewhat disinclined to add that complexity to the pfncache code.
> 
> We take the refresh_lock *mutex* in __kvm_gpc_refresh() so it's not as
> simple as declaring that said function is called with the gpc rwlock
> already held.
> 
> We also do the final gpc_unmap_khva() of the old mapping after dropping
> the lock; *could* we call that with a write lock held? A write lock
> which is going to be taken in the MM notifier callbacks? Well, maybe not
> in the case of the first *activate* which isn't really a 'refresh' per
> se but the whole thing is making my skin itch. I don't like it.
> 
>>> I think we need to ensure that any kvm_xen_set_evtchn_fast() which
>>> happens at this point cannot proceed, and falls back to the slow path.
>>>
>>> Can we set a flag before we activate the vcpu_info and clear it after
>>> the memcpy is done, then make kvm_xen_set_evtchn_fast() return
>>> EWOULDBLOCK whenever that flag is set?
>>>
>>> The slow path in kvm_xen_set_evtchn() takes kvm->arch.xen.xen_lock and
>>> I think kvm_xen_vcpu_set_attr() has taken that same lock before you get
>>> to this code, so it works out nicely?
>>>
>>
>> Yes, I think that is safe... but if we didn't have the window between
>> activating the vcpu_info cache and doing the copy we'd also be ok I
>> think... Or perhaps we could simply preserve evtchn_pending_sel and copy
>> the rest of it?
> 
> I suppose you could just write the evtchn_pending_sel word in the new
> vcpu_info GPA to zero before setting up the pfncache for it.
> 
> Then when you do the memcpy, you don't *just* memcpy the
> evtchn_pending_sel word; you use the bitwise OR of the old and new, so
> you catch any bits which got set in the new word in the interim?
> 
> But then again, who moves the vcpu_info while there are actually
> interrupts in-flight to the vCPU in question? Maybe we just declare
> that we don't care, and that interrupts may be lost in that case? Even
> if *Xen* wouldn't have lost them (and I don't even know that part is
> true).
> 
>>> This adds a new lock ordering rule of the vcpu_info lock(s) before the
>>> shared_info lock. I don't know that it's *wrong* but it seems weird to
>>> me; I expected the shared_info to come first?
>>>
>>> I avoided taking both at once in kvm_xen_set_evtchn_fast(), although
>>> maybe if we are going to have a rule that allows both, we could revisit
>>> that. Suspect it isn't needed.
>>>
>>> Either way it is worth a clear comment somewhere to document the lock
>>> ordering, and I'd also like to know this has been tested with lockdep,
>>> which is often cleverer than me.
>>>
>>
>> Ok. I agree that shared_info before vcpu_info does seem more intuitive
>> and maybe it would be better given the code in
>> kvm_xen_set_evtchn_fast(). I'll see how messy it gets in re-ordering
>> and add a comment as you suggest.
>>
> 
> I think they look interchangeable in this case. If we *do* take them
> both in kvm_xen_set_evtchn_fast() then maybe we can simplify the slow
> path where it has set the bits in shared_info but the vcpu_info gpc
> was invalid. That currently uses a kvm->arch.xen.evtchn_pending_sel
> shadow of the bits, and just kicks the vCPU to deliver them for
> itself... but maybe that whole thing could be dropped, and
> kvm_xen_set_evtchn_fast() can just return EWOULDBLOCK if it fails to
> lock *both* shared_info and vcpu_info at the same time?
> 

Yes, I think that sounds like a neater approach.

> I didn't do that before, because I didn't want to introduce lock
> ordering rules. But I'm happier to do so now. And I think we can ditch
> a lot of hairy asm in kvm_xen_inject_pending_events() ?
> 

Messing with the asm sounds like something for a follow-up though.

   Paul
