linux-kernel - Re: [PATCH] s390/vfio-ap: do not open code locks for VFIO_GROUP_NOTIFY_SET


Message-ID: <0e5fae99-762e-8060-c9ed-36674effa68b@linux.ibm.com>
Date:   Thu, 15 Jul 2021 10:38:54 -0400
From:   Tony Krowiak <akrowiak@...ux.ibm.com>
To:     Halil Pasic <pasic@...ux.ibm.com>
Cc:     linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        borntraeger@...ibm.com, cohuck@...hat.com,
        pasic@...ux.vnet.ibm.com, jjherne@...ux.ibm.com, jgg@...dia.com,
        alex.williamson@...hat.com, kwankhede@...dia.com,
        frankja@...ux.ibm.com, david@...hat.com, imbrenda@...ux.ibm.com,
        hca@...ux.ibm.com
Subject: Re: [PATCH] s390/vfio-ap: do not open code locks for
 VFIO_GROUP_NOTIFY_SET_KVM notification



On 7/15/21 9:44 AM, Halil Pasic wrote:
> On Wed,  7 Jul 2021 11:41:56 -0400
> Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
>
> First, sorry for being this late in taking a more serious look at the
> code.
>
>
>> @@ -270,6 +270,9 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
>>    * We take the matrix_dev lock to ensure serialization on queues and
>>    * mediated device access.
>>    *
>> + * Note: This function must be called with a read lock held on
>> + *	 vcpu->kvm->arch.crypto.pqap_hook_rwsem.
>> + *
>
> That is a fine synchronization for the pqap_hook, but I don't think it
> is sufficient for everything.
>
>
>>    * Return 0 if we could handle the request inside KVM.
>>    * otherwise, returns -EOPNOTSUPP to let QEMU handle the fault.
>>    */
>> @@ -287,22 +290,12 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
>>   		return -EOPNOTSUPP;
>>
>>   	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
>> -	mutex_lock(&matrix_dev->lock);
> Here you drop a matrix_dev->lock critical section, and then you do
> all the interesting stuff, e.g.
> q = vfio_ap_get_queue(matrix_mdev, apqn);
> and
> vfio_ap_irq_enable(q, status & 0x07, vcpu->run->s.regs.gprs[2]);
> Since vfio_ap_get_queue() checks whether the queue belongs to the
> given guest and examines the matrix (apm, aqm), I suppose that needs
> to be done while holding a lock that protects the matrix, and to the
> best of my knowledge this is still matrix_dev->lock. It would
> probably make sense to convert matrix_dev->lock into an rw_semaphore,
> or to introduce some new rwlock which protects less state in the
> future, but right now AFAICT it is still matrix_dev->lock.
>
> So I don't think this patch should pass review.

Good catch. In an earlier patch review, Jason G suggested locking the
kvm->lock mutex outside of the kvm_arch_crypto_set_masks() and
kvm_arch_crypto_clear_masks() functions to resolve the lockdep splat
resulting from locking the matrix_dev->lock mutex prior to the
kvm->lock mutex. I believe this will allow me to remove the
kvm_busy/wait queue scenario without introducing a new rwsem.
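
To make the ordering concrete, here is a minimal user-space sketch of one
reading of that suggestion: the caller takes the outer lock first, then the
inner one, and the mask helper no longer locks anything itself. This is not
the kernel code. The pthread mutexes stand in for kvm->lock and
matrix_dev->lock, and set_kvm(), unset_kvm(), set_masks_locked(),
clear_masks_locked(), masks_set and mdev_has_kvm are hypothetical names used
only for illustration.

```c
#include <pthread.h>

/* Hypothetical stand-ins for kvm->lock and matrix_dev->lock. */
static pthread_mutex_t kvm_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t matrix_dev_lock = PTHREAD_MUTEX_INITIALIZER;

static int masks_set;    /* stand-in for the guest's crypto masks */
static int mdev_has_kvm; /* stand-in for matrix_mdev->kvm != NULL */

/* Stand-ins for the mask helpers: the caller must already hold kvm_lock. */
static void set_masks_locked(void)   { masks_set = 1; }
static void clear_masks_locked(void) { masks_set = 0; }

/* Attach the guest: always kvm_lock first, then matrix_dev_lock. */
static int set_kvm(void)
{
	pthread_mutex_lock(&kvm_lock);
	pthread_mutex_lock(&matrix_dev_lock);
	set_masks_locked();
	mdev_has_kvm = 1;
	pthread_mutex_unlock(&matrix_dev_lock);
	pthread_mutex_unlock(&kvm_lock);
	return masks_set;
}

/* Detach the guest: same lock order, so the two paths cannot deadlock. */
static int unset_kvm(void)
{
	pthread_mutex_lock(&kvm_lock);
	pthread_mutex_lock(&matrix_dev_lock);
	clear_masks_locked();
	mdev_has_kvm = 0;
	pthread_mutex_unlock(&matrix_dev_lock);
	pthread_mutex_unlock(&kvm_lock);
	return masks_set;
}
```

Because every path acquires the two locks in the same order, no inverted
ordering is left for lockdep to report, and nothing has to drop
matrix_dev_lock midway and wait the way the kvm_busy/wait queue did.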

>
> Regards,
> Halil
>
>>   	if (!vcpu->kvm->arch.crypto.pqap_hook)
>>   		goto out_unlock;
>>   	matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
>>   				   struct ap_matrix_mdev, pqap_hook);
>>
>> -	/*
>> -	 * If the KVM pointer is in the process of being set, wait until the
>> -	 * process has completed.
>> -	 */
>> -	wait_event_cmd(matrix_mdev->wait_for_kvm,
>> -		       !matrix_mdev->kvm_busy,
>> -		       mutex_unlock(&matrix_dev->lock),
>> -		       mutex_lock(&matrix_dev->lock));
>> -
>>   	/* If the there is no guest using the mdev, there is nothing to do */
>>   	if (!matrix_mdev->kvm)
>>   		goto out_unlock;