Message-ID: <25edecce-0795-3b00-a155-bfcc8499f1be@linux.ibm.com>
Date:   Wed, 30 Jun 2021 10:31:22 -0400
From:   Tony Krowiak <akrowiak@...ux.ibm.com>
To:     Halil Pasic <pasic@...ux.ibm.com>
Cc:     linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        borntraeger@...ibm.com, cohuck@...hat.com,
        pasic@...ux.vnet.ibm.com, jjherne@...ux.ibm.com, jgg@...dia.com,
        alex.williamson@...hat.com, kwankhede@...dia.com,
        frankja@...ux.ibm.com, david@...hat.com, imbrenda@...ux.ibm.com,
        hca@...ux.ibm.com
Subject: Re: [PATCH] s390/vfio-ap: do not use open locks during
 VFIO_GROUP_NOTIFY_SET_KVM notification



On 6/28/21 4:29 PM, Halil Pasic wrote:
> On Fri, 25 Jun 2021 18:07:58 -0400
> Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
>
> What is a suitable base for this patch? I've tried the usual suspects,
> but none of them worked.

I discovered what the problem is here. The patch is based on our
master branch plus two prerequisite patches that were recently
reviewed and are currently being merged:

* [PATCH v6 1/2] s390/vfio-ap: clean up mdev resources when remove
  callback invoked
    Message ID: <20210621155714.1198545-2-akrowiak@...ux.ibm.com>

* [PATCH v6 2/2] s390/vfio-ap: r/w lock for PQAP interception handler
  function pointer
    Message ID: <20210621155714.1198545-3-akrowiak@...ux.ibm.com>

I probably should have included those along with this one.

>
>> The fix to resolve a lockdep splat while handling the
>> VFIO_GROUP_NOTIFY_SET_KVM event introduced a kvm_busy flag indicating that
>> the vfio_ap device driver is busy setting or unsetting the KVM pointer.
>> A wait queue was employed to allow functions requiring access to the KVM
>> pointer to wait for the kvm_busy flag to be cleared. For the duration of
>> the wait period, the mdev lock was unlocked then acquired again after the
>> kvm_busy flag was cleared. This got rid of the lockdep report, but didn't
>> really resolve the problem.
> Can you please elaborate on the last point? You mean that we can have
> circular locking even after 0cc00c8d4050, but instead of getting stuck
> on a lock we will get stuck on wait_event_cmd()? If that is it, please
> state it clearly in the description, and if you can, do it in the short
> description.

This patch was in response to the following review comments made by Jason
Gunthorpe:

* Message ID: <20210525162927.GC1002214@...dia.com>
    "... the kvm_busy should be replaced by a proper rwsem,
     don't try to open code locks like that - it just defeats lockdep
     analysis".

* Message ID: <20210527112433.GX1002214@...dia.com>
    "Usually when people start open coding locks it is often
    because lockdep complained. Open coding a lock makes
    lockdep stop because the lockdep code
    is removed, but it doesn't fix anything. The kvm_busy
    should be replaced by a proper rwsem, don't try to
    open code locks like that - it just defeats lockdep
    analysis."

I will paraphrase and include the information from Jason's
comments in the description.

>> This patch removes the kvm_busy flag and wait queue as they are not
>> necessary to resolve the lockdep splat problem. The wait queue was
>> introduced to prevent changes to the matrix used to update the guest's
>> AP configuration. The idea was that whenever an adapter, domain or control
>> domain was being assigned to or unassigned from the matrix, the function
>> would wait until the group notifier function was no longer busy with the
>> KVM pointer.
>>
>> The thing is, the KVM pointer value (matrix_mdev->kvm) is always set and
>> cleared while holding the matrix_dev->lock mutex. The assignment and
>> unassignment interfaces also lock the matrix_dev->lock mutex prior to
>> checking whether the matrix_mdev->kvm pointer is set and, if so, return
>> the -EBUSY error from the function. Consequently, there is no chance for
>> an update to the matrix to occur while the guest's AP configuration is
>> being updated.
>>
>> Fixes: 0cc00c8d4050 ("s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks")
>> Cc: stable@...r.kernel.org
>> Signed-off-by: Tony Krowiak <akrowiak@...ux.ibm.com>
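
To illustrate the argument in the quoted description, here is a minimal
userspace sketch. It is not the vfio_ap driver code: a pthread mutex
stands in for matrix_dev->lock, and the struct and function names
(matrix_mdev_sketch, sketch_set_kvm, sketch_assign_adapter, sketch_demo)
are hypothetical. The only point is that when the KVM pointer is set and
cleared under the same mutex that the assignment path takes before
checking it, the assignment path can return -EBUSY without a separate
busy flag or wait queue.

```c
/* Userspace sketch only; names are hypothetical, not the driver's API. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the device-wide state guarded by matrix_dev->lock. */
struct matrix_dev_sketch {
	pthread_mutex_t lock;
};

/* Stand-in for a mediated device: a KVM pointer plus an adapter mask. */
struct matrix_mdev_sketch {
	void *kvm;          /* non-NULL while a guest is using the mdev */
	unsigned long apm;  /* one bit per assigned adapter */
};

static struct matrix_dev_sketch matrix_dev = { PTHREAD_MUTEX_INITIALIZER };

/* Set or clear the KVM pointer, always while holding the mutex. */
static void sketch_set_kvm(struct matrix_mdev_sketch *m, void *kvm)
{
	pthread_mutex_lock(&matrix_dev.lock);
	m->kvm = kvm;
	pthread_mutex_unlock(&matrix_dev.lock);
}

/* Assign an adapter; refuse with -EBUSY while the KVM pointer is set. */
static int sketch_assign_adapter(struct matrix_mdev_sketch *m,
				 unsigned int apid)
{
	int ret = 0;

	pthread_mutex_lock(&matrix_dev.lock);
	if (m->kvm)
		ret = -EBUSY;   /* guest in use: no matrix updates allowed */
	else if (apid >= 8 * sizeof(m->apm))
		ret = -ENODEV;  /* adapter number out of range for the mask */
	else
		m->apm |= 1UL << apid;
	pthread_mutex_unlock(&matrix_dev.lock);

	return ret;
}

/* Walk through assign, set KVM, rejected assign, clear KVM, assign. */
static void sketch_demo(void)
{
	struct matrix_mdev_sketch m = { NULL, 0 };
	int guest; /* any non-NULL address serves as a fake KVM pointer */

	assert(sketch_assign_adapter(&m, 3) == 0);
	sketch_set_kvm(&m, &guest);
	assert(sketch_assign_adapter(&m, 4) == -EBUSY);
	sketch_set_kvm(&m, NULL);
	assert(sketch_assign_adapter(&m, 4) == 0);
	assert(m.apm == ((1UL << 3) | (1UL << 4)));
}
```

Because both paths serialize on one real lock, a lock checker can see
every acquisition, which is the property Jason's comments ask for; an
open-coded flag plus wait queue hides that ordering from the checker.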
