Message-ID: <20210611170526.GU1002214@nvidia.com>
Date: Fri, 11 Jun 2021 14:05:26 -0300
From: Jason Gunthorpe <jgg@...dia.com>
To: Tony Krowiak <akrowiak@...ux.ibm.com>
Cc: linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
borntraeger@...ibm.com, cohuck@...hat.com,
pasic@...ux.vnet.ibm.com, jjherne@...ux.ibm.com,
alex.williamson@...hat.com, kwankhede@...dia.com,
frankja@...ux.ibm.com, david@...hat.com, imbrenda@...ux.ibm.com,
hca@...ux.ibm.com
Subject: Re: [PATCH 2/3] s390/vfio-ap: introduce two new r/w locks to replace
wait_queue_head_t

On Wed, Jun 09, 2021 at 06:46:33PM -0400, Tony Krowiak wrote:
> This patch introduces two new r/w locks to replace the wait_queue_head_t
> that was introduced to fix a lockdep splat reported when testing
> pass-through of AP queues to a Secure Execution guest. This was the
> abbreviated dependency chain reported by lockdep that was fixed using
> a wait queue:
>
> kvm_arch_crypto_set_masks+0x4a/0x2b8 [kvm] kvm->lock
> vfio_ap_mdev_group_notifier+0x154/0x170 [vfio_ap] matrix_dev->lock
>
> handle_pqap+0x56/0x1d0 [vfio_ap] matrix_dev->lock
> kvm_vcpu_ioctl+0x2cc/0x898 [kvm] vcpu->mutex
>
> kvm_s390_cpus_to_pv+0x4e/0xf8 [kvm] vcpu->mutex
> kvm_arch_vm_ioctl+0x3ec/0x550 [kvm] kvm->lock

Is the problem larger than kvm_arch_crypto_set_masks()? If not, it
looks easy enough to fix: pull the kvm->lock out of
kvm_arch_crypto_set_masks() and obtain it in vfio_ap_mdev_set_kvm()
before the rwsem. Then your locks are taken in the right order and all
should be well?

> +static int vfio_ap_mdev_matrix_store_lock(struct ap_matrix_mdev *matrix_mdev)
> +{
> +	if (!down_write_trylock(&matrix_mdev->rwsem))
> +		return -EBUSY;
> +
> +	if (matrix_mdev->kvm) {
> +		up_write(&matrix_mdev->rwsem);
> +		return -EBUSY;
> +	}
> +
> +	if (!down_write_trylock(&matrix_mdev->matrix.rwsem)) {
> +		up_write(&matrix_mdev->rwsem);
> +		return -EBUSY;
> +	}
> +
> +	return 0;
> +}

This double locking is quite strange; at the very least it deserves a
detailed comment. The comments suggest these locks protect distinct
data, so..

> +
> +	ret = vfio_ap_mdev_matrix_store_lock(matrix_mdev);
> +	if (ret)
> +		return ret;
>
> clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm);

here it obtained both locks but only touched matrix.aqm, which is
protected only by the inner lock. What was the point of obtaining the
outer lock?

Also, I'm not convinced down_write_trylock() is appropriate in a sysfs
callback; it should block and wait, surely? Otherwise userspace gets
random, racy failures depending on what the kernel happens to be doing.

Jason