Date:   Mon, 24 Sep 2018 15:22:13 +0200
From:   Harald Freudenberger <freude@...ux.ibm.com>
To:     Halil Pasic <pasic@...ux.ibm.com>,
        Cornelia Huck <cohuck@...hat.com>,
        Tony Krowiak <akrowiak@...ux.vnet.ibm.com>
Cc:     linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, freude@...ibm.com, schwidefsky@...ibm.com,
        heiko.carstens@...ibm.com, borntraeger@...ibm.com,
        kwankhede@...dia.com, bjsdjshi@...ux.vnet.ibm.com,
        pbonzini@...hat.com, alex.williamson@...hat.com,
        pmorel@...ux.vnet.ibm.com, alifm@...ux.vnet.ibm.com,
        mjrosato@...ux.vnet.ibm.com, jjherne@...ux.vnet.ibm.com,
        thuth@...hat.com, pasic@...ux.vnet.ibm.com, berrange@...hat.com,
        fiuczy@...ux.vnet.ibm.com, buendgen@...ibm.com,
        frankja@...ux.ibm.com, Tony Krowiak <akrowiak@...ux.ibm.com>
Subject: Re: [PATCH v10 13/26] s390: vfio-ap: zeroize the AP queues

On 24.09.2018 14:16, Halil Pasic wrote:
>
> On 09/24/2018 01:36 PM, Cornelia Huck wrote:
>> On Wed, 12 Sep 2018 15:43:03 -0400
>> Tony Krowiak <akrowiak@...ux.vnet.ibm.com> wrote:
>>
>>> From: Tony Krowiak <akrowiak@...ux.ibm.com>
>>>
>>> Let's call PAPQ(ZAPQ) to zeroize a queue for each queue configured
>>> for a mediated matrix device when it is released.
>>>
>>> Zeroizing a queue resets the queue, clears all pending
>>> messages for the queue entries and disables adapter interruptions
>>> associated with the queue.
>>>
>>> Signed-off-by: Tony Krowiak <akrowiak@...ux.ibm.com>
>>> Reviewed-by: Halil Pasic <pasic@...ux.ibm.com>
>>> Tested-by: Michael Mueller <mimu@...ux.ibm.com>
>>> Tested-by: Farhan Ali <alifm@...ux.ibm.com>
>>> Signed-off-by: Christian Borntraeger <borntraeger@...ibm.com>
>>> ---
>>>  drivers/s390/crypto/vfio_ap_ops.c |   44 +++++++++++++++++++++++++++++++++++++
>>>  1 files changed, 44 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>>> index f8b276a..48b1b78 100644
>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>> @@ -829,6 +829,49 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>>>  	return NOTIFY_OK;
>>>  }
>>>  
>>> +static int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
>>> +				    unsigned int retry)
>>> +{
>>> +	struct ap_queue_status status;
>>> +
>>> +	do {
>>> +		status = ap_zapq(AP_MKQID(apid, apqi));
>>> +		switch (status.response_code) {
>>> +		case AP_RESPONSE_NORMAL:
>>> +			return 0;
>>> +		case AP_RESPONSE_RESET_IN_PROGRESS:
>>> +		case AP_RESPONSE_BUSY:
>>> +			msleep(20);
>>> +			break;
>>> +		default:
>>> +			/* things are really broken, give up */
>>> +			return -EIO;
>>> +		}
>>> +	} while (retry--);
>>> +
>>> +	return -EBUSY;
>> So, this function may either return 0, -EIO (things are really broken),
>> or -EBUSY (still busy after multiple tries)...
>>
>>> +}
>>> +
>>> +static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev)
>>> +{
>>> +	int ret;
>>> +	int rc = 0;
>>> +	unsigned long apid, apqi;
>>> +	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> +
>>> +	for_each_set_bit_inv(apid, matrix_mdev->matrix.apm,
>>> +			     matrix_mdev->matrix.apm_max + 1) {
>>> +		for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
>>> +				     matrix_mdev->matrix.aqm_max + 1) {
>>> +			ret = vfio_ap_mdev_reset_queue(apid, apqi, 1);
>>> +			if (ret)
>>> +				rc = ret;
>> ...and here, we return the last error of any of the resets. Two
>> questions:
>>
>> - Does it make sense to continue if we get -EIO? IOW, does "really
>>   broken" only refer to a certain tuple and other tuples still can/need
>>   to be reset?
> I think it does make sense to continue, because IMHO "things are really
> broken" is an overstatement (I mean the APQN invalid case). One could
> argue whether skipping the current card (adapter) would be justified.
>
> IMHO the current code is good enough for the first shot, and we can think
> about fine-tuning it later.
Absolutely. The -EIO case is reached, for example, when the APQN
is 'deconfigured', which means the crypto adapter is logically unplugged.
So the -EIO case should NOT lead to any fatal action such as panic()
or shutting down a KVM guest.
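
To make the behaviour under discussion concrete, here is a minimal user-space
sketch of the retry loop and the "continue on error" aggregation. It is not the
driver code: sim_zapq(), struct sim_queue, struct sim_reset_stats and the
SIM_RESPONSE_* values are hypothetical stand-ins for ap_zapq() and the AP
response codes, and the 20 ms sleep is left out. The per-outcome counters are
also an assumption added for illustration; the patch only returns the last
error.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical response codes; the names only mirror the patch. */
enum sim_response {
	SIM_RESPONSE_NORMAL,
	SIM_RESPONSE_BUSY,
	SIM_RESPONSE_RESET_IN_PROGRESS,
	SIM_RESPONSE_DECONFIGURED,
};

/* Hypothetical queue model: a scripted list of responses. */
struct sim_queue {
	const enum sim_response *script;
	size_t len;
	size_t pos;
};

/* Hypothetical counters, not part of the patch. */
struct sim_reset_stats {
	unsigned int ok;
	unsigned int busy;
	unsigned int failed;
};

/* Stand-in for ap_zapq(): replay the script, repeating its last entry. */
static enum sim_response sim_zapq(struct sim_queue *q)
{
	enum sim_response r = q->script[q->pos];

	if (q->pos + 1 < q->len)
		q->pos++;
	return r;
}

/* Same control flow as the patch: retry == 1 allows two attempts. */
static int sim_reset_queue(struct sim_queue *q, unsigned int retry)
{
	do {
		switch (sim_zapq(q)) {
		case SIM_RESPONSE_NORMAL:
			return 0;
		case SIM_RESPONSE_RESET_IN_PROGRESS:
		case SIM_RESPONSE_BUSY:
			/* the driver sleeps 20 ms here before retrying */
			break;
		default:
			/* e.g. deconfigured APQN: give up on this queue only */
			return -EIO;
		}
	} while (retry--);

	return -EBUSY;
}

/* Keep going after any error; rc holds the last error seen. */
static int sim_reset_queues(struct sim_queue *queues, size_t n,
			    struct sim_reset_stats *stats)
{
	int rc = 0;

	for (size_t i = 0; i < n; i++) {
		int ret = sim_reset_queue(&queues[i], 1);

		if (ret == 0) {
			stats->ok++;
		} else if (ret == -EBUSY) {
			stats->busy++;
			rc = ret;
		} else {
			stats->failed++;
			rc = ret;
		}
	}
	return rc;
}
```

With four simulated queues (deconfigured, busy once then normal, always busy,
normal), the loop visits all of them, the returned value is the -EBUSY from the
third queue, and the earlier -EIO is overwritten. That matches the "last error
wins" behaviour Cornelia points out, and a caller that treats either code as
non-fatal is unaffected by which one it gets.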
>> - Is the return code useful in any way, as we don't know which tuple it
>>   refers to?
>>
> Well, good question. It conveys that the operation can 'fail'. AFAIR -EBUSY
> is mostly fine given what the architecture says, if we are satisfied with just
> a reset. And the cases behind -EIO might actually be OK too in the same sense.
> My guess is that, based on the return value, client code can tell whether we
> have zeroized all queues or basically just reset them (like rapq). We could log
> that to some debug facility or whatever, I guess, but at the moment we don't care.
>
> In the end I think the code is good enough as is, and if we want we can
> improve on it later.
>
> Regards,
> Halil
>
>
>>> +		}
>>> +	}
>>> +
>>> +	return rc;
>>> +}
>>> +
>>>  static int vfio_ap_mdev_open(struct mdev_device *mdev)
>>>  {
>>>  	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> @@ -859,6 +902,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>>>  	if (matrix_mdev->kvm)
>>>  		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>>>  
>>> +	vfio_ap_mdev_reset_queues(mdev);
>>>  	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>>>  				 &matrix_mdev->group_notifier);
>>>  	matrix_mdev->kvm = NULL;
