lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210210234606.1d0dbdec.pasic@linux.ibm.com>
Date:   Wed, 10 Feb 2021 23:46:06 +0100
From:   Halil Pasic <pasic@...ux.ibm.com>
To:     Tony Krowiak <akrowiak@...ux.ibm.com>
Cc:     Cornelia Huck <cohuck@...hat.com>, linux-s390@...r.kernel.org,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        stable@...r.kernel.org, borntraeger@...ibm.com,
        kwankhede@...dia.com, pbonzini@...hat.com,
        alex.williamson@...hat.com, pasic@...ux.vnet.ibm.com
Subject: Re: [PATCH 1/1] s390/vfio-ap: fix circular lockdep when
 setting/clearing crypto masks

On Wed, 10 Feb 2021 17:05:48 -0500
Tony Krowiak <akrowiak@...ux.ibm.com> wrote:

> On 2/10/21 10:32 AM, Halil Pasic wrote:
> > On Wed, 10 Feb 2021 16:24:29 +0100
> > Halil Pasic <pasic@...ux.ibm.com> wrote:
> >  
> >>> Maybe you could
> >>> - grab a reference to kvm while holding the lock
> >>> - call the mask handling functions with that kvm reference
> >>> - lock again, drop the reference, and do the rest of the processing?  
> >> I agree, matrix_mdev->kvm can go NULL any time and we are risking
> >> a null pointer dereference here.
> >>
> >> Another idea would be to do
> >>
> >>
> >> static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
> >> {
> >>          struct kvm *kvm;
> >>                                                          
> >>          mutex_lock(&matrix_dev->lock);
> >>          if (matrix_mdev->kvm) {
> >>                  kvm = matrix_mdev->kvm;
> >>                  matrix_mdev->kvm = NULL;
> >>                  mutex_unlock(&matrix_dev->lock);
> >>                  kvm_arch_crypto_clear_masks(kvm);
> >>                  mutex_lock(&matrix_dev->lock);
> >>                  matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;  
> > s/matrix_mdev->kvm/kvm  
> >>                  vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> >>                  kvm_put_kvm(kvm);
> >>          }
> >>          mutex_unlock(&matrix_dev->lock);
> >> }
> >>
> >> That way only one unset would actually do the unset and cleanup
> >> and every other invocation would bail out with only checking
> >> matrix_mdev->kvm.  
> > But the problem with that is that we enable the the assign/unassign
> > prematurely, which could interfere wit reset_queues(). Forget about
> > it.  
> 
> Not sure what you mean by this.
> 
> 

I mean because above I first do
(1) matrix_mdev->kvm = NULL;
and then do 
(2) vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
another thread could do 
static ssize_t unassign_adapter_store(struct device *dev,                       
                                      struct device_attribute *attr,            
                                      const char *buf, size_t count)            
{                                                                               
        int ret;                                                                
        unsigned long apid;                                                     
        struct mdev_device *mdev = mdev_from_dev(dev);                          
        struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);            
                                                                                
        /* If the guest is running, disallow un-assignment of adapter */        
        if (matrix_mdev->kvm)                                                   
                return -EBUSY;   
...
}
between (1) and (2), and we would not bail out with -EBUSY because !!kvm
because of (1). That means we would change matrix_mdev->matrix and we
would not reset the queues that correspond to the apid that was just
removed, because by the time we do the reset_queues, the queues are
not in the matrix_mdev->matrix any more.

Does that make sense?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ