lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 2 Jun 2022 15:19:43 -0400
From:   Tony Krowiak <akrowiak@...ux.ibm.com>
To:     jjherne@...ux.ibm.com, linux-s390@...r.kernel.org,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Cc:     freude@...ux.ibm.com, borntraeger@...ibm.com, cohuck@...hat.com,
        mjrosato@...ux.ibm.com, pasic@...ux.ibm.com,
        alex.williamson@...hat.com, kwankhede@...dia.com,
        fiuczy@...ux.ibm.com
Subject: Re: [PATCH v19 15/20] s390/vfio-ap: implement in-use callback for
 vfio_ap driver



On 6/2/22 2:16 PM, Jason J. Herne wrote:
> On 4/4/22 18:10, Tony Krowiak wrote:
>> Let's implement the callback to indicate when an APQN
>> is in use by the vfio_ap device driver. The callback is
>> invoked whenever a change to the apmask or aqmask would
>> result in one or more queue devices being removed from the driver. The
>> vfio_ap device driver will indicate a resource is in use
>> if the APQN of any of the queue devices to be removed are assigned to
>> any of the matrix mdevs under the driver's control.
>>
>> There is potential for a deadlock condition between the
>> matrix_dev->guests_lock used to lock the guest during assignment of
>> adapters and domains and the ap_perms_mutex locked by the AP bus when
>> changes are made to the sysfs apmask/aqmask attributes.
>>
>> The AP Perms lock controls access to the objects that store the adapter
>> numbers (ap_perms) and domain numbers (aq_perms) for the sysfs
>> /sys/bus/ap/apmask and /sys/bus/ap/aqmask attributes. These attributes
>> identify which queues are reserved for the zcrypt default device 
>> drivers.
>> Before allowing a bit to be removed from either mask, the AP bus must 
>> check
>> with the vfio_ap device driver to verify that none of the queues are
>> assigned to any of its mediated devices.
>>
>> The apmask/aqmask attributes can be written or read at any time from
>> userspace, so care must be taken to prevent a deadlock with asynchronous
>> operations that might be taking place in the vfio_ap device driver. For
>> example, consider the following:
>>
>> 1. A system administrator assigns an adapter to a mediated device 
>> under the
>>     control of the vfio_ap device driver. The driver will need to 
>> first take
>>     the matrix_dev->guests_lock to potentially hot plug the adapter into
>>     the KVM guest.
>> 2. At the same time, a system administrator sets a bit in the sysfs
>>     /sys/bus/ap/ap_mask attribute. To complete the operation, the AP bus
>>     must:
>>     a. Take the ap_perms_mutex lock to update the object storing the 
>> values
>>        for the /sys/bus/ap/ap_mask attribute.
>>     b. Call the vfio_ap device driver's in-use callback to verify 
>> that the
>>        queues now being reserved for the default zcrypt drivers are not
>>        assigned to a mediated device owned by the vfio_ap device 
>> driver. To
>>        do the verification, the in-use callback function takes the
>>        matrix_dev->guests_lock, but has to wait because it is already 
>> held
>>        by the operation in 1 above.
>> 3. The vfio_ap device driver calls an AP bus function to verify that the
>>     new queues resulting from the assignment of the adapter in step 1 
>> are
>>     not reserved for the default zcrypt device driver. This AP bus 
>> function
>>     tries to take the ap_perms_mutex lock but gets stuck waiting for the
>>     waiting for the lock due to step 2a above.
>>
>> Consequently, we have the following deadlock situation:
>>
>> matrix_dev->guests_lock locked (1)
>> ap_perms_mutex lock locked (2a)
>> Waiting for matrix_dev->gusts_lock (2b) which is currently held (1)
>> Waiting for ap_perms_mutex lock (3) which is currently held (2a)
>>
>> To prevent this deadlock scenario, the function called in step 3 will no
>> longer take the ap_perms_mutex lock and require the caller to take the
>> lock. The lock will be the first taken by the adapter/domain assignment
>> functions in the vfio_ap device driver to maintain the proper locking
>> order.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@...ux.ibm.com>
>> ---
>>   drivers/s390/crypto/ap_bus.c          | 31 ++++++++----
>>   drivers/s390/crypto/vfio_ap_drv.c     |  1 +
>>   drivers/s390/crypto/vfio_ap_ops.c     | 68 +++++++++++++++++++++++++++
>>   drivers/s390/crypto/vfio_ap_private.h |  2 +
>>   4 files changed, 94 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
>> index fdf16cb70881..f48552e900a2 100644
>> --- a/drivers/s390/crypto/ap_bus.c
>> +++ b/drivers/s390/crypto/ap_bus.c
>> @@ -817,6 +817,17 @@ static void ap_bus_revise_bindings(void)
>>       bus_for_each_dev(&ap_bus_type, NULL, NULL, __ap_revise_reserved);
>>   }
>>   +/**
>> + * ap_apqn_in_matrix_owned_by_def_drv: indicates whether an APQN c 
>> is reserved
>> + *                       for the host drivers or not.
>> + * @card: the APID of the adapter card to check
>> + * @queue: the APQI of the queue to check
>> + *
>> + * Note: the ap_perms_mutex must be locked by the caller of this 
>> function.
>> + *
>> + * Return: an int specifying whether the APQN is reserved for the 
>> host (1) or
>> + *       not (0)
>> + */
>
> Comment's function name does not match real function name. Also APQN 
> "c", from
> description does not appear to exist.

No, it definitely does not match and the 'c' is an extraneous typo. I'll 
fix both.

>
>
>
>>   int ap_owned_by_def_drv(int card, int queue)
>>   {
>>       int rc = 0;
>> @@ -824,25 +835,31 @@ int ap_owned_by_def_drv(int card, int queue)
>>       if (card < 0 || card >= AP_DEVICES || queue < 0 || queue >= 
>> AP_DOMAINS)
>>           return -EINVAL;
>>   -    mutex_lock(&ap_perms_mutex);
>> -
>>       if (test_bit_inv(card, ap_perms.apm)
>>           && test_bit_inv(queue, ap_perms.aqm))
>>           rc = 1;
>>   -    mutex_unlock(&ap_perms_mutex);
>> -
>>       return rc;
>>   }
>>   EXPORT_SYMBOL(ap_owned_by_def_drv);
>>   +/**
>> + * ap_apqn_in_matrix_owned_by_def_drv: indicates whether every APQN 
>> contained in
>> + *                       a set is reserved for the host drivers
>> + *                       or not.
>> + * @apm: a bitmap specifying a set of APIDs comprising the APQNs to 
>> check
>> + * @aqm: a bitmap specifying a set of APQIs comprising the APQNs to 
>> check
>> + *
>> + * Note: the ap_perms_mutex must be locked by the caller of this 
>> function.
>> + *
>> + * Return: an int specifying whether each APQN is reserved for the 
>> host (1) or
>> + *       not (0)
>> + */
>>   int ap_apqn_in_matrix_owned_by_def_drv(unsigned long *apm,
>>                          unsigned long *aqm)
>>   {
>>       int card, queue, rc = 0;
>>   -    mutex_lock(&ap_perms_mutex);
>> -
>>       for (card = 0; !rc && card < AP_DEVICES; card++)
>>           if (test_bit_inv(card, apm) &&
>>               test_bit_inv(card, ap_perms.apm))
>> @@ -851,8 +868,6 @@ int ap_apqn_in_matrix_owned_by_def_drv(unsigned 
>> long *apm,
>>                       test_bit_inv(queue, ap_perms.aqm))
>>                       rc = 1;
>>   -    mutex_unlock(&ap_perms_mutex);
>> -
>>       return rc;
>>   }
>>   EXPORT_SYMBOL(ap_apqn_in_matrix_owned_by_def_drv);
>> diff --git a/drivers/s390/crypto/vfio_ap_drv.c 
>> b/drivers/s390/crypto/vfio_ap_drv.c
>> index c258e5f7fdfc..2c3084589347 100644
>> --- a/drivers/s390/crypto/vfio_ap_drv.c
>> +++ b/drivers/s390/crypto/vfio_ap_drv.c
>> @@ -107,6 +107,7 @@ static const struct attribute_group 
>> vfio_queue_attr_group = {
>>   static struct ap_driver vfio_ap_drv = {
>>       .probe = vfio_ap_mdev_probe_queue,
>>       .remove = vfio_ap_mdev_remove_queue,
>> +    .in_use = vfio_ap_mdev_resource_in_use,
>>       .ids = ap_queue_ids,
>>   };
>>   diff --git a/drivers/s390/crypto/vfio_ap_ops.c 
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index 49ed54dc9e05..3ece2cd9f1e7 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -902,6 +902,21 @@ static int 
>> vfio_ap_mdev_verify_no_sharing(unsigned long *mdev_apm,
>>       return 0;
>>   }
>>   +/**
>> + * vfio_ap_mdev_validate_masks - verify that the APQNs assigned to 
>> the mdev are
>> + *                 not reserved for the default zcrypt driver and
>> + *                 are not assigned to another mdev.
>> + *
>> + * @matrix_mdev: the mdev to which the APQNs being validated are 
>> assigned.
>> + *
>> + * Return: One of the following values:
>> + * o the error returned from the 
>> ap_apqn_in_matrix_owned_by_def_drv() function,
>> + *   most likely -EBUSY indicating the ap_perms_mutex lock is 
>> already held.
>> + * o EADDRNOTAVAIL if an APQN assigned to @matrix_mdev is reserved 
>> for the
>> + *           zcrypt default driver.
>> + * o EADDRINUSE if an APQN assigned to @matrix_mdev is assigned to 
>> another mdev
>> + * o A zero indicating validation succeeded.
>> + */
>>   static int vfio_ap_mdev_validate_masks(struct ap_matrix_mdev 
>> *matrix_mdev)
>>   {
>>       if (ap_apqn_in_matrix_owned_by_def_drv(matrix_mdev->matrix.apm,
>> @@ -951,6 +966,10 @@ static void vfio_ap_mdev_link_adapter(struct 
>> ap_matrix_mdev *matrix_mdev,
>>    *       An APQN derived from the cross product of the APID being 
>> assigned
>>    *       and the APQIs previously assigned is being used by another 
>> mediated
>>    *       matrix device
>> + *
>> + *    5. -EAGAIN
>> + *       A lock required to validate the mdev's AP configuration 
>> could not
>> + *       be obtained.
>>    */
>>   static ssize_t assign_adapter_store(struct device *dev,
>>                       struct device_attribute *attr,
>> @@ -961,6 +980,7 @@ static ssize_t assign_adapter_store(struct device 
>> *dev,
>>       DECLARE_BITMAP(apm_delta, AP_DEVICES);
>>       struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev);
>>   +    mutex_lock(&ap_perms_mutex);
>>       get_update_locks_for_mdev(matrix_mdev);
>>         ret = kstrtoul(buf, 0, &apid);
>> @@ -991,6 +1011,7 @@ static ssize_t assign_adapter_store(struct 
>> device *dev,
>>       ret = count;
>>   done:
>>       release_update_locks_for_mdev(matrix_mdev);
>> +    mutex_unlock(&ap_perms_mutex);
>>         return ret;
>>   }
>> @@ -1144,6 +1165,10 @@ static void vfio_ap_mdev_link_domain(struct 
>> ap_matrix_mdev *matrix_mdev,
>>    *       An APQN derived from the cross product of the APQI being 
>> assigned
>>    *       and the APIDs previously assigned is being used by another 
>> mediated
>>    *       matrix device
>> + *
>> + *    5. -EAGAIN
>> + *       The lock required to validate the mdev's AP configuration 
>> could not
>> + *       be obtained.
>>    */
>>   static ssize_t assign_domain_store(struct device *dev,
>>                      struct device_attribute *attr,
>> @@ -1154,6 +1179,7 @@ static ssize_t assign_domain_store(struct 
>> device *dev,
>>       DECLARE_BITMAP(aqm_delta, AP_DOMAINS);
>>       struct ap_matrix_mdev *matrix_mdev = dev_get_drvdata(dev);
>>   +    mutex_lock(&ap_perms_mutex);
>>       get_update_locks_for_mdev(matrix_mdev);
>>         ret = kstrtoul(buf, 0, &apqi);
>> @@ -1184,6 +1210,7 @@ static ssize_t assign_domain_store(struct 
>> device *dev,
>>       ret = count;
>>   done:
>>       release_update_locks_for_mdev(matrix_mdev);
>> +    mutex_unlock(&ap_perms_mutex);
>>         return ret;
>>   }
>> @@ -1868,3 +1895,44 @@ void vfio_ap_mdev_remove_queue(struct 
>> ap_device *apdev)
>>       kfree(q);
>>       release_update_locks_for_mdev(matrix_mdev);
>>   }
>> +
>> +/**
>> + * vfio_ap_mdev_resource_in_use: check whether any of a set of APQNs is
>> + *                 assigned to a mediated device under the control
>> + *                 of the vfio_ap device driver.
>> + *
>> + * @apm: a bitmap specifying a set of APIDs comprising the APQNs to 
>> check.
>> + * @aqm: a bitmap specifying a set of APQIs comprising the APQNs to 
>> check.
>> + *
>> + * This function is invoked by the AP bus when changes to the 
>> apmask/aqmask
>> + * attributes will result in giving control of the queue devices 
>> specified via
>> + * @apm and @aqm to the default zcrypt device driver. Prior to 
>> calling this
>> + * function, the AP bus locks the ap_perms_mutex. If this function 
>> is called
>> + * while an adapter or domain is being assigned to a mediated 
>> device, the
>> + * assignment operations will take the matrix_dev->guests_lock and
>> + * matrix_dev->mdevs_lock then call the 
>> ap_apqn_in_matrix_owned_by_def_drv
>> + * function, which also locks the ap_perms_mutex. This could result 
>> in a
>> + * deadlock.
>> + *
>> + * To avoid a deadlock, this function will verify that the
>> + * matrix_dev->guests_lock and matrix_dev->mdevs_lock are not 
>> currently held and
>> + * will return -EBUSY if the locks can not be obtained.
>
> This part of the comment does not seem to match reality. Unless I'm 
> missing
> something? Did you mean to call mutex_trylock? Or is the comment stale?

The comment was written prior to the changes introduced to avoid the locking
dependency (i.e., taking the ap_perms_mutex lock by the assignment functions
which I believe was your suggestion). I shall remove the comment.

>
>> + * Return:
>> + *    * -EBUSY if the locks required by this function are already 
>> locked.
>> + *    * -EADDRINUSE if one or more of the APQNs specified via 
>> @apm/@aqm are
>> + *      assigned to a mediated device under the control of the vfio_ap
>> + *      device driver.
>> + */
>> +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long 
>> *aqm)
>> +{
>> +    int ret;
>> +
>> +    mutex_lock(&matrix_dev->guests_lock);
>> +    mutex_lock(&matrix_dev->mdevs_lock);
>> +    ret = vfio_ap_mdev_verify_no_sharing(apm, aqm);
>> +    mutex_unlock(&matrix_dev->mdevs_lock);
>> +    mutex_unlock(&matrix_dev->guests_lock);
>> +
>> +    return ret;
>> +}
>> diff --git a/drivers/s390/crypto/vfio_ap_private.h 
>> b/drivers/s390/crypto/vfio_ap_private.h
>> index 6d4ca39f783b..cbffa0bd01da 100644
>> --- a/drivers/s390/crypto/vfio_ap_private.h
>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>> @@ -147,4 +147,6 @@ void vfio_ap_mdev_unregister(void);
>>   int vfio_ap_mdev_probe_queue(struct ap_device *queue);
>>   void vfio_ap_mdev_remove_queue(struct ap_device *queue);
>>   +int vfio_ap_mdev_resource_in_use(unsigned long *apm, unsigned long 
>> *aqm);
>> +
>>   #endif /* _VFIO_AP_PRIVATE_H_ */
>
>

Powered by blists - more mailing lists