[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c27236d3-6125-4049-6268-3d9c93cf3ef2@oracle.com>
Date: Wed, 5 Dec 2018 09:01:33 -0500
From: Boris Ostrovsky <boris.ostrovsky@...cle.com>
To: Roger Pau Monné <roger.pau@...rix.com>,
Chao Gao <chao.gao@...el.com>
Cc: linux-kernel@...r.kernel.org, Juergen Gross <jgross@...e.com>,
Stefano Stabellini <sstabellini@...nel.org>,
Jia-Ju Bai <baijiaju1990@...il.com>,
xen-devel@...ts.xenproject.org, Jan Beulich <jbeulich@...e.com>
Subject: Re: [PATCH] xen: xen-pciback: Reset MSI-X state when exposing a
device
On 12/5/18 4:32 AM, Roger Pau Monné wrote:
> On Wed, Dec 05, 2018 at 10:19:17AM +0800, Chao Gao wrote:
>> I find some pass-thru devices don't work any more across guest reboot.
>> Assigning it to another guest also meets the same issue. And the only
>> way to make it work again is un-binding and binding it to pciback.
>> Someone reported this issue one year ago [1]. More detail also can be
>> found in [2].
>>
>> The root-cause is Xen's internal MSI-X state isn't reset properly
>> during reboot or re-assignment. In the above case, Xen set maskall bit
>> to mask all MSI interrupts after it detected a potential security
>> issue. Even after device reset, Xen didn't reset its internal maskall
>> bit. As a result, maskall bit would be set again in next write to
>> MSI-X message control register.
>>
>> Given that PHYSDEVOPS_prepare_msix() also triggers Xen resetting MSI-X
>> internal state of a device, we employ it to fix this issue rather than
>> introducing another dedicated sub-hypercall.
>>
>> Note that PHYSDEVOPS_release_msix() will fail if the mapping between
>> the device's msix and pirq has been created. This limitation prevents
>> us calling this function when detaching a device from a guest during
>> guest shutdown. Thus it is called right before calling
>> PHYSDEVOPS_prepare_msix().
> s/PHYSDEVOPS/PHYSDEVOP/ (no final S). And then I would also drop the
> () at the end of the hypercall name since it's not a function.
>
> I'm also wondering why the release can't be done when the device is
> detached from the guest (or the guest has been shut down). This makes
> me worry about the raciness of the attach/detach procedure: if there's
> a state where pciback assumes the device has been detached from the
> guest, but there are still pirqs bound, an attempt to attach to
> another guest in such state will fail.
I wonder whether this additional reset functionality could be done out
of xen_pcibk_xenbus_remove(). We first do a (best effort) device reset
and then do the extra things that are not properly done there.
-boris
Powered by blists - more mailing lists