Message-ID: <20210726223628.4d7759bf.pasic@linux.ibm.com>
Date:   Mon, 26 Jul 2021 22:36:28 +0200
From:   Halil Pasic <pasic@...ux.ibm.com>
To:     Tony Krowiak <akrowiak@...ux.ibm.com>
Cc:     linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        borntraeger@...ibm.com, cohuck@...hat.com,
        pasic@...ux.vnet.ibm.com, jjherne@...ux.ibm.com, jgg@...dia.com,
        alex.williamson@...hat.com, kwankhede@...dia.com, david@...hat.com
Subject: Re: [PATCH 2/2] s390/vfio-ap: replace open coded locks for
 VFIO_GROUP_NOTIFY_SET_KVM notification

On Fri, 23 Jul 2021 17:24:16 -0400
Tony Krowiak <akrowiak@...ux.ibm.com> wrote:

> On 7/23/21 10:26 AM, Halil Pasic wrote:
> > On Thu, 22 Jul 2021 09:09:26 -0400
> > Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
> >  
> >> On 7/21/21 10:45 AM, Halil Pasic wrote:  
> >>> On Mon, 19 Jul 2021 15:35:03 -0400
> >>> Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
> >>>     
> >>>> It was pointed out during an unrelated patch review that locks should not  
> >>> [..]
> >>>     
> >>>> -static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
> >>>> +static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev,
> >>>> +				   struct kvm *kvm)
> >>>>    {
> >>>> -	/*
> >>>> -	 * If the KVM pointer is in the process of being set, wait until the
> >>>> -	 * process has completed.
> >>>> -	 */
> >>>> -	wait_event_cmd(matrix_mdev->wait_for_kvm,
> >>>> -		       !matrix_mdev->kvm_busy,
> >>>> -		       mutex_unlock(&matrix_dev->lock),
> >>>> -		       mutex_lock(&matrix_dev->lock));
> >>>> -
> >>>> -	if (matrix_mdev->kvm) {  
> >>> We used to check if matrix_mdev->kvm is null, but ...
> >>>     
> >>>> -		matrix_mdev->kvm_busy = true;
> >>>> -		mutex_unlock(&matrix_dev->lock);
> >>>> -
> >>>> -		if (matrix_mdev->kvm->arch.crypto.crycbd) {
> >>>> -			down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> >>>> -			matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> >>>> -			up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> >>>> -
> >>>> -			kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> >>>> -		}
> >>>> +	if (kvm->arch.crypto.crycbd) {  
> >>> ... now we just try to dereference it. And ..  
> >> We used to check matrix_mdev->kvm; now the kvm pointer is passed into
> >> the function. Having said that, the pointer passed in should be
> >> checked before de-referencing it.
> >>  
> >>>     
> >>>> +		down_write(&kvm->arch.crypto.pqap_hook_rwsem);
> >>>> +		kvm->arch.crypto.pqap_hook = NULL;
> >>>> +		up_write(&kvm->arch.crypto.pqap_hook_rwsem);
> >>>>
> >>>> +		mutex_lock(&kvm->lock);
> >>>>    		mutex_lock(&matrix_dev->lock);
> >>>> +
> >>>> +		kvm_arch_crypto_clear_masks(kvm);
> >>>>    		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> >>>> -		kvm_put_kvm(matrix_mdev->kvm);
> >>>> +		kvm_put_kvm(kvm);
> >>>>    		matrix_mdev->kvm = NULL;
> >>>> -		matrix_mdev->kvm_busy = false;
> >>>> -		wake_up_all(&matrix_mdev->wait_for_kvm);
> >>>> +
> >>>> +		mutex_unlock(&kvm->lock);
> >>>> +		mutex_unlock(&matrix_dev->lock);
> >>>>    	}
> >>>>    }
> >>>>     
> >>> [..]
> >>>     
> >>>> @@ -1363,14 +1323,11 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
> >>>>    {
> >>>>    	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> >>>>
> >>>> -	mutex_lock(&matrix_dev->lock);
> >>>> -	vfio_ap_mdev_unset_kvm(matrix_mdev);
> >>>> -	mutex_unlock(&matrix_dev->lock);
> >>>> -  
> >>> .. before access to the matrix_mdev->kvm used to be protected by
> >>> the matrix_dev->lock ...
> >>>     
> >>>>    	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
> >>>>    				 &matrix_mdev->iommu_notifier);
> >>>>    	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
> >>>>    				 &matrix_mdev->group_notifier);
> >>>> +	vfio_ap_mdev_unset_kvm(matrix_mdev, matrix_mdev->kvm);  
> >>> ... but it is not any more. BTW I don't think the code is guaranteed
> >>> to fetch ->kvm just once.  
> >> There are a couple of things to point out here:
> >> 1. The vfio_ap_mdev_unset_kvm() function is the only place where the
> >>       matrix_mdev->kvm pointer is cleared. That function is called here
> >>       as well as by the group notifier callback for VFIO_GROUP_NOTIFY_SET_KVM
> >>       events. If you notice in the code above, the group notifier is
> >>       unregistered before calling the unset function, so either the
> >>       notifier will have already been invoked and the pointer cleared
> >>       (which is why you are correct that the KVM pointer passed in needs
> >>       to get checked in the unset function), or it will get cleared here.
> > Hm, vfio_unregister_notifier() indeed seems to guarantee that by the
> > time it returns no notifier is running. I didn't know that. But the
> > blocking notifier chain uses an rwsem, so if mutual exclusion with
> > vfio_ap_mdev_open() is guaranteed, then it is indeed guaranteed. A
> > quick glance at the code didn't tell me whether vfio/mdev guarantees
> > that.
> >
> > I mean, it would make sense to me to make the init and the cleanup
> > mutually exclusive, but I'm reluctant to just assume it is like that.
> > Can you please point me in the right direction?  
> 
> I'm not quite sure what you're asking for here, but I'll give it a
> shot. The notifier is registered by the vfio_ap_mdev_open()
> callback which is invoked in response to the opening of the mdev fd.

vfio_ap_mdev_open <-vfio_group_get_device_fd

> Since the mdev fd can't be closed unless and until it's open,
> there is no way for the vfio_ap_mdev_release() callback, which
> is invoked when the mdev fd is closed, to execute at the same
> time as the vfio_ap_mdev_open() callback.

A plain old file can be opened several times over (and thus closed
several times over as well). So if you imagine something like:

thread A               thread B
--------               --------
open()
close()                open()
open()                 close()
close()

you may end up with open and close running interleaved. What I'm
trying to say is that, to my best knowledge, there is normally no
"you have to close it before you open it again" rule for files.

Does that make sense?
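
To make the interleaving concrete, here is a minimal userspace sketch
(purely illustrative; the path is made up and error handling is
omitted). Two threads open and close the same file with nothing
ordering their open()/close() pairs against each other:

#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical path, just for the example. */
#define EXAMPLE_PATH "/dev/example"

static void *worker(void *arg)
{
        int fd;

        (void)arg;
        /* Nothing forces this thread's close() to happen before the
         * other thread's next open(). */
        fd = open(EXAMPLE_PATH, O_RDWR);
        if (fd >= 0)
                close(fd);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
}

(Compile with -pthread.) The point is only that open and close on the
same file can interleave across threads; there is no inherent
close-before-reopen ordering.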

If there is no special mechanism to prevent such a scenario for the
mdev device, then I guess the second open (B, if you like) would try
to register a group notifier. I'm not aware of such a mechanism. I
had a brief look at vfio_group_get_device_fd() but couldn't find any
such logic there.
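
For what it's worth, the kind of mechanism I was looking for would be
something along these lines -- a purely hypothetical sketch, not actual
vfio/mdev code, with the notifier handling reduced to comments:

#include <linux/mutex.h>
#include <linux/mdev.h>

/* Hypothetical lock serializing the open and release paths. */
static DEFINE_MUTEX(example_open_release_lock);

static int example_mdev_open(struct mdev_device *mdev)
{
        mutex_lock(&example_open_release_lock);
        /* ... register the group/iommu notifiers, set up state ... */
        mutex_unlock(&example_open_release_lock);
        return 0;
}

static void example_mdev_release(struct mdev_device *mdev)
{
        mutex_lock(&example_open_release_lock);
        /* ... unregister the notifiers, tear down state ... */
        mutex_unlock(&example_open_release_lock);
}

I did not spot anything playing this role on the
vfio_group_get_device_fd() path, but I may well have missed it.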

Maybe Connie can actually help us with this one?



> Since the release
> callback unregisters the notifier and both the registration and
> unregistration are done under the same rwsem, I don't see how
> they can be anything but mutually exclusive. Am I missing something
> here?
> 
> >
> >  
> >> 2. The release callback is invoked when the mdev fd is closed by userspace.
> >>       The remove callback is the only place where the matrix_mdev is freed.
> >>       The remove callback is not called until the mdev fd is released, so
> >>       it is guaranteed the matrix_mdev will exist when the release callback
> >>       is invoked.
> >> 3. The matrix_dev->lock is then taken in the vfio_ap_mdev_unset_kvm function
> >>       before doing any operations that modify the matrix_mdev.  
> > Yeah, but both the reader and the writer need to use the same lock to
> > have a "protected by the lock" type of situation. That is why I asked
> > about the place where you read matrix_mdev members outside the
> > matrix_dev->lock.  
> 
> With this patch, the matrix_mdev is always written or read with
> the matrix_dev->lock mutex locked.

Sorry, I don't understand. How is matrix_mdev->kvm read with the
matrix_dev->lock mutex held in:

static void vfio_ap_mdev_release(struct mdev_device *mdev)
{
        struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);

        vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
                                 &matrix_mdev->iommu_notifier);
        vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
                                 &matrix_mdev->group_notifier);
        vfio_ap_mdev_unset_kvm(matrix_mdev, matrix_mdev->kvm);   <==== HERE!
        module_put(THIS_MODULE);
}
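
To illustrate what I mean, roughly: a locked read plus a NULL check in
the release path could look something like the fragment below. This is
only a sketch of the shape (and on its own it does not close every
window, since the lock is dropped again before the unset call):

        struct kvm *kvm;

        mutex_lock(&matrix_dev->lock);
        kvm = matrix_mdev->kvm;         /* single read, under the lock */
        mutex_unlock(&matrix_dev->lock);

        if (kvm)                        /* don't dereference a NULL kvm */
                vfio_ap_mdev_unset_kvm(matrix_mdev, kvm);
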
> By moving the locking of the
> kvm->lock mutex out of the functions that set/clear the masks
> in the guest's CRYCB, we are now able to lock the kvm->lock
> mutex before locking the matrix_dev->lock mutex in both the
> vfio_ap_mdev_set_kvm() and vfio_ap_mdev_unset_kvm()
> functions. This precludes the need to unlock the
> matrix_dev->lock mutex while the masks are being set or
> cleared and alleviates the lockdep splat for which the open coded
> locks were created.
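
If I read that right, the intended ordering boils down to something
like this (a rough sketch of the shape only, with the body elided and
the unlocking written in the conventional reverse order):

        mutex_lock(&kvm->lock);
        mutex_lock(&matrix_dev->lock);

        /* ... set or clear the CRYCB masks, reset the queues, ... */

        mutex_unlock(&matrix_dev->lock);
        mutex_unlock(&kvm->lock);

which, as you say, removes the need to drop matrix_dev->lock (the old
kvm_busy/wait_for_kvm handshake) while the masks are set or cleared.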


> 
> >
> > Regards,
> > Halil  
> 
