lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c54ef522-f348-df16-a99f-1e31feb1b0bd@linux.ibm.com>
Date:   Tue, 25 May 2021 11:56:50 -0400
From:   Tony Krowiak <akrowiak@...ux.ibm.com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     jjherne@...ux.ibm.com, linux-s390@...r.kernel.org,
        linux-kernel@...r.kernel.org, borntraeger@...ibm.com,
        cohuck@...hat.com, pasic@...ux.vnet.ibm.com,
        alex.williamson@...hat.com, kwankhede@...dia.com,
        frankja@...ux.ibm.com, david@...hat.com, imbrenda@...ux.ibm.com,
        hca@...ux.ibm.com
Subject: Re: [PATCH v4 2/2] s390/vfio-ap: control access to PQAP(AQIC)
 interception handler


> Why can't you put the locks in the right order? It looked trivial, I'm confused.
>
> Jason

I explained this in one of my responses to the previous series. Maybe I 
didn't do
a good job of it, so let me see if I can provide a more thorough 
explanation.

The handle_pqap() function in priv.c does not have any access to the
matrix_dev->lock which lives in the vfio_ap module, so there is no
way for this function to control the order of locking. The only lock
it can access is the lock provided for the hook function pointer which
must be held for the duration of the execution of the hook. When the
hook function (handle_pqap() in vfio_ap_ops.c) is called, it has to lock
the matrix_dev->lock mutex to do its thing. The interception of the
PQAP instruction that sets off the above scenario can happen simultaneously
with both the vfio_ap_mdev_set_kvm() and vfio_ap_mdev_unset_kvm()
instructions in vfio_ap_ops.c.

The vfio_ap_mdev_set_kvm() function is called only by the group notifier
callback when the vfio_ap driver is notified that the KVM pointer has 
been set.
In this case, we could set the lock for the hook function before setting
the matrix_dev->lock and calling the vfio_ap_mdev_set_kvm() function and
all would be well.

The vfio_ap_mdev_unset_kvm() function, however, is called both by the group
notifier when the KVM pointer has been cleared or when the mdev is
being removed. In both cases, the only way to get the KVM pointer - which
is needed to unplug the AP resources from the guest - is from the 
matrix_mdev
which contains it. This, of course, needs to be done while holding the
matrix_dev->lock mutex. The vfio_ap_mdev_unset_kvm() function also
clears the hook function pointer, but can only get the lock used to 
control access
to it from the matrix_mdev; therein lies the rub. So we can have the 
following
scenario which is flagged by lockdep:

CPU x:                                            CPU y:
--------                                             --------
                                                       lock the 
matrix_dev->lock:
vfio_ap_mdev_set_kvm in vfio_ap_ops.c

lock the hook pointer:
handle_pqap in priv.c

lock the matrix_dev->lock
handle_pqap in vfio_ap_ops.c

                                                       lock the hook 
pointer:
vfio_ap_mdev_set_kvm in vfio_ap_ops.c


Maybe I'm missing something, but I was unable to find a way around this when
the hook function pointer and its locking mechanism is stored in a field 
of a satellite
structure of struct kvm.



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ