[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ae6e5c7a-0159-035e-2bd3-0a749f81a7c0@de.ibm.com>
Date: Wed, 16 Dec 2020 10:58:48 +0100
From: Christian Borntraeger <borntraeger@...ibm.com>
To: Halil Pasic <pasic@...ux.ibm.com>
Cc: Tony Krowiak <akrowiak@...ux.ibm.com>, linux-s390@...r.kernel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
stable@...r.kernel.org, gregkh@...uxfoundation.org,
sashal@...nel.org, cohuck@...hat.com, kwankhede@...dia.com,
pbonzini@...hat.com, alex.williamson@...hat.com,
pasic@...ux.vnet.ibm.com
Subject: Re: [PATCH v3] s390/vfio-ap: clean up vfio_ap resources when KVM
pointer invalidated
On 16.12.20 02:21, Halil Pasic wrote:
> On Tue, 15 Dec 2020 19:10:20 +0100
> Christian Borntraeger <borntraeger@...ibm.com> wrote:
>
>>
>>
>> On 15.12.20 11:57, Halil Pasic wrote:
>>> On Mon, 14 Dec 2020 11:56:17 -0500
>>> Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
>>>
>>>> The vfio_ap device driver registers a group notifier with VFIO when the
>>>> file descriptor for a VFIO mediated device for a KVM guest is opened to
>>>> receive notification that the KVM pointer is set (VFIO_GROUP_NOTIFY_SET_KVM
>>>> event). When the KVM pointer is set, the vfio_ap driver takes the
>>>> following actions:
>>>> 1. Stashes the KVM pointer in the vfio_ap_mdev struct that holds the state
>>>> of the mediated device.
>>>> 2. Calls the kvm_get_kvm() function to increment its reference counter.
>>>> 3. Sets the function pointer to the function that handles interception of
>>>> the instruction that enables/disables interrupt processing.
>>>> 4. Sets the masks in the KVM guest's CRYCB to pass AP resources through to
>>>> the guest.
>>>>
>>>> In order to avoid memory leaks, when the notifier is called to receive
>>>> notification that the KVM pointer has been set to NULL, the vfio_ap device
>>>> driver should reverse the actions taken when the KVM pointer was set.
>>>>
>>>> Fixes: 258287c994de ("s390: vfio-ap: implement mediated device open callback")
>>>> Signed-off-by: Tony Krowiak <akrowiak@...ux.ibm.com>
>>>> ---
>>>> drivers/s390/crypto/vfio_ap_ops.c | 29 ++++++++++++++++++++---------
>>>> 1 file changed, 20 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>>>> index e0bde8518745..cd22e85588e1 100644
>>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>>> @@ -1037,8 +1037,6 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
>>>> {
>>>> struct ap_matrix_mdev *m;
>>>>
>>>> - mutex_lock(&matrix_dev->lock);
>>>> -
>>>> list_for_each_entry(m, &matrix_dev->mdev_list, node) {
>>>> if ((m != matrix_mdev) && (m->kvm == kvm)) {
>>>> mutex_unlock(&matrix_dev->lock);
>>>> @@ -1049,7 +1047,6 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
>>>> matrix_mdev->kvm = kvm;
>>>> kvm_get_kvm(kvm);
>>>> kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
>>>> - mutex_unlock(&matrix_dev->lock);
>>>>
>>>> return 0;
>>>> }
>>>> @@ -1083,35 +1080,49 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
>>>> return NOTIFY_DONE;
>>>> }
>>>>
>>>> +static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>>>> +{
>>>> + kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>>>> + matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>>>
>>>
>>> This patch LGTM. The only concern I have with it is whether a
>>> different cpu is guaranteed to observe the above assignment as
>>> an atomic operation. I think we didn't finish this discussion
>>> at v1, or did we?
>>
>> You mean just this assigment:
>>>> + matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>> should either have the old or the new value, but not halve zero halve old?
>>
>
> Yes that is the assignment I was referring to. Old value will work as well because
> kvm holds a reference to this module while in the pqap_hook.
>
>> Normally this should be ok (and I would consider this a compiler bug if
>> this is split into 2 32 bit zeroes) But if you really want to be sure then we
>> can use WRITE_ONCE.
>
> Just my curiosity: what would make this a bug? Is it the s390 elf ABI,
> or some gcc feature, or even the C standard? Also how exactly would
> WRITE_ONCE, also access via volatile help in this particular situation?
I think its a tricky things and not strictly guaranteed, but there is a lot
of code that relies on the atomicity of word sizes. see for example the discussion
here
https://lore.kernel.org/lkml/CAHk-=wgC4+kV9AiLokw7cPP429rKCU+vjA8cWAfyOjC3MtqC4A@mail.gmail.com/
WRITE_ONCE will not change the guarantees a lot, but it is mostly a documentation
that we assume atomic access here.
>
> I agree, if the member is properly aligned, (which it is),
> normally/probably we are fine on s390x (which is also a given).
>
>> I think we take this via the s390 tree? I can add the WRITE_ONCE when applying?
>
> Yes that works fine with me.
>
> Reviewed-by: Halil Pasic <pasic@...ux.ibm.com>
>
Powered by blists - more mailing lists