linux-kernel - Re: [PATCH] s390/vfio-ap: fix unregister GISC when KVM is already gone results in OOPS


Message-ID: <3795bc75-9d5e-2098-fd18-f1cbaef9c290@linux.ibm.com>
Date:   Fri, 25 Sep 2020 18:29:16 -0400
From:   Tony Krowiak <akrowiak@...ux.ibm.com>
To:     Halil Pasic <pasic@...ux.ibm.com>
Cc:     linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, pmorel@...ux.ibm.com,
        alex.williamson@...hat.com, cohuck@...hat.com,
        kwankhede@...dia.com, borntraeger@...ibm.com
Subject: Re: [PATCH] s390/vfio-ap: fix unregister GISC when KVM is already
 gone results in OOPS



On 9/21/20 11:45 AM, Halil Pasic wrote:
> On Fri, 18 Sep 2020 13:02:34 -0400
> Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
>
>> Attempting to unregister Guest Interruption Subclass (GISC) when the
>> link between the matrix mdev and KVM has been removed results in the
>> following:
>>
>>     "Kernel panic - not syncing: Fatal exception: panic_on_oops"
>>
>> This patch fixes this bug by verifying the matrix mdev and KVM are still
>> linked prior to unregistering the GISC.
>
> I read from your commit message that this happens when the link between
> the KVM and the matrix mdev was established and then got severed.
>
> I assume the interrupts were previously enabled, and had not been
> disabled or cleaned up because q->saved_isc != VFIO_AP_ISC_INVALID.
>
> That means the guest enabled interrupts and then for whatever
> reason got destroyed, and this happens on mdev cleanup.
>
> Does it happen all the time or is it some sort of a race?

This is a race condition that happens when a guest is terminated and the
mdev is removed in rapid succession. I came across it with one of my hades
test cases on cleanup of the resources after the test case completes. There
is a bug in the vfio_ap_mdev_release function: it tries to reset the APQNs
after the bits are cleared from the matrix_mdev.matrix, so the resets never
happen.

Fixing that, however, does not resolve the issue, so I'm in the process of
doing a bunch of tracing to see the flow of the resets etc. during the
lifecycle of the mdev during this hades test. I should have a better answer
next week.

>
>> Signed-off-by: Tony Krowiak <akrowiak@...ux.ibm.com>
>> ---
>>   drivers/s390/crypto/vfio_ap_ops.c | 14 +++++++++-----
>>   1 file changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>> index e0bde8518745..847a88642644 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -119,11 +119,15 @@ static void vfio_ap_wait_for_irqclear(int apqn)
>>    */
>>   static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q)
>>   {
>> -	if (q->saved_isc != VFIO_AP_ISC_INVALID && q->matrix_mdev)
>> -		kvm_s390_gisc_unregister(q->matrix_mdev->kvm, q->saved_isc);
>> -	if (q->saved_pfn && q->matrix_mdev)
>> -		vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
>> -				 &q->saved_pfn, 1);
>> +	if (q->matrix_mdev) {
>> +		if (q->saved_isc != VFIO_AP_ISC_INVALID && q->matrix_mdev->kvm)
>> +			kvm_s390_gisc_unregister(q->matrix_mdev->kvm,
>> +						 q->saved_isc);
> I don't quite understand the logic here. I suppose we need to ensure
> that the struct kvm is 'alive' at least until kvm_s390_gisc_unregister()
> is done. That is supposed to be ensured by kvm_get_kvm() in
> vfio_ap_mdev_set_kvm() and kvm_put_kvm() in vfio_ap_mdev_release().
>
> If the critical section in vfio_ap_mdev_release() is done and
> matrix_mdev->kvm was set to NULL there then I would expect that the
> queues are already reset and q->saved_isc == VFIO_AP_ISC_INVALID. So
> this should not blow up.
>
> Now if this happens before the critical section in
> vfio_ap_mdev_release() is done, I ask myself how are we going to do the
> kvm_put_kvm()?
>
> Another question. Do we hold the matrix_dev->lock here?
>
>> +		if (q->saved_pfn)
>> +			vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
>> +					 &q->saved_pfn, 1);
>> +	}
>> +
>>   	q->saved_pfn = 0;
>>   	q->saved_isc = VFIO_AP_ISC_INVALID;
>>   }
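
For readers who want to exercise the guard logic of the patched function
outside the kernel, the following is a rough userspace model. All model_*
types and MODEL_ISC_INVALID are hypothetical stand-ins (for struct
vfio_ap_queue, the matrix mdev, struct kvm, and VFIO_AP_ISC_INVALID), and
the counters merely record where the real code would call
kvm_s390_gisc_unregister() and vfio_unpin_pages(). It says nothing about
locking or the kvm_get_kvm()/kvm_put_kvm() lifetime questions raised above.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ISC_INVALID 0xff	/* stand-in for VFIO_AP_ISC_INVALID */

struct model_kvm {
	int gisc_unregistered;	/* counts modelled GISC unregister calls */
};

struct model_mdev {
	struct model_kvm *kvm;	/* NULL once the link to KVM is gone */
	int pages_unpinned;	/* counts modelled unpin calls */
};

struct model_queue {
	struct model_mdev *matrix_mdev;
	unsigned char saved_isc;
	unsigned long saved_pfn;
};

/* Mirrors the control flow of the patched function in the diff. */
void model_free_aqic_resources(struct model_queue *q)
{
	assert(q != NULL);

	if (q->matrix_mdev) {
		/* Only touch the KVM object if it is still linked. */
		if (q->saved_isc != MODEL_ISC_INVALID && q->matrix_mdev->kvm)
			q->matrix_mdev->kvm->gisc_unregistered++;
		/* Unpinning depends on the mdev only, not on KVM. */
		if (q->saved_pfn)
			q->matrix_mdev->pages_unpinned++;
	}

	q->saved_pfn = 0;
	q->saved_isc = MODEL_ISC_INVALID;
}
```

In this model, a queue with a valid saved_isc whose mdev has kvm == NULL
skips the unregister step instead of dereferencing a NULL pointer, while
the page is still unpinned and the saved state is still cleared.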