[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <2549bc2f-4e5a-cc24-b976-f771a7243cc8@linux.ibm.com>
Date: Tue, 7 May 2019 11:12:22 -0400
From: Tony Krowiak <akrowiak@...ux.ibm.com>
To: pmorel@...ux.ibm.com, linux-s390@...r.kernel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Cc: freude@...ux.ibm.com, borntraeger@...ibm.com, cohuck@...hat.com,
frankja@...ux.ibm.com, david@...hat.com, schwidefsky@...ibm.com,
heiko.carstens@...ibm.com, pasic@...ux.ibm.com,
alex.williamson@...hat.com, kwankhede@...dia.com
Subject: Re: [PATCH v2 1/7] s390: vfio-ap: wait for queue empty on queue reset
On 5/7/19 4:10 AM, Pierre Morel wrote:
> On 06/05/2019 21:37, Tony Krowiak wrote:
>> On 5/6/19 2:41 AM, Pierre Morel wrote:
>>> On 03/05/2019 23:14, Tony Krowiak wrote:
>>>> Refactors the AP queue reset function to wait until the queue is empty
>>>> after the PQAP(ZAPQ) instruction is executed to zero out the queue as
>>>> required by the AP architecture.
>>>>
>>>> Signed-off-by: Tony Krowiak <akrowiak@...ux.ibm.com>
>>>> ---
>>>> drivers/s390/crypto/vfio_ap_ops.c | 35
>>>> ++++++++++++++++++++++++++++++++---
>>>> 1 file changed, 32 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c
>>>> b/drivers/s390/crypto/vfio_ap_ops.c
>>>> index 900b9cf20ca5..b88a2a2ba075 100644
>>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>>> @@ -271,6 +271,32 @@ static int
>>>> vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
>>>> return 0;
>>>> }
>>>> +static void vfio_ap_mdev_wait_for_qempty(unsigned long apid,
>>>> unsigned long apqi)
>>>> +{
>>>> + struct ap_queue_status status;
>>>> + ap_qid_t qid = AP_MKQID(apid, apqi);
>>>> + int retry = 5;
>>>> +
>>>> + do {
>>>> + status = ap_tapq(qid, NULL);
>>>> + switch (status.response_code) {
>>>> + case AP_RESPONSE_NORMAL:
>>>> + if (status.queue_empty)
>>>> + return;
>>>> + msleep(20);
>>>
>>> NIT: Fall through ?
>>
>> Yes
>>
>>>
>>>> + break;
>>>> + case AP_RESPONSE_RESET_IN_PROGRESS:
>>>> + case AP_RESPONSE_BUSY:
>>>> + msleep(20);
>>>> + break;
>>>> + default:
>>>> + pr_warn("%s: tapq err %02x: %04lx.%02lx may not be
>>>> empty\n",
>>>> + __func__, status.response_code, apid, apqi);
>>>
>>> I do not thing the warning sentence is appropriate:
>>> The only possible errors here are if the AP is not available due to
>>> AP checkstop, deconfigured AP or invalid APQN.
>>
>> Right you are! I'll work on a new message.
>>
>>>
>>>
>>>> + return;
>>>> + }
>>>> + } while (--retry);
>>>> +}
>>>> +
>>>> /**
>>>> * assign_adapter_store
>>>> *
>>>> @@ -790,15 +816,18 @@ static int vfio_ap_mdev_group_notifier(struct
>>>> notifier_block *nb,
>>>> return NOTIFY_OK;
>>>> }
>>>> -static int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int
>>>> apqi,
>>>> - unsigned int retry)
>>>> +int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi)
>>>> {
>>>> struct ap_queue_status status;
>>>> + int retry = 5;
>>>> do {
>>>> status = ap_zapq(AP_MKQID(apid, apqi));
>>>> switch (status.response_code) {
>>>> case AP_RESPONSE_NORMAL:
>>>> + vfio_ap_mdev_wait_for_qempty(apid, apqi);
>>>> + return 0;
>>>> + case AP_RESPONSE_DECONFIGURED:
>>>
>>> Since you modify the switch, you can return for all the following cases:
>>> AP_RESPONSE_DECONFIGURE
>>> ..._CHECKSTOP
>>> ..._INVALID_APQN
>>>
>>>
>>> And you should wait for qempty on AP_RESET_IN_PROGRESS along with
>>> AP_RESPONSE_NORMAL
>>
>> If a queue reset is in progress, we retry the zapq. Are you saying we
>> should wait for qempty then reissue the zapq?
>
>
> Yes, I fear that if we reissue the zapq while RESET is in progress we
> could fall in a loop depending on the reset hardware time and the
> software retry .
I already did this in the forthcoming v4 series.
>
>>
>>>
>>>> return 0;
>>>> case AP_RESPONSE_RESET_IN_PROGRESS:
>>>> case AP_RESPONSE_BUSY:
>>>
>>> While at modifying this function, the AP_RESPONSE_BUSY is not a valid
>>> code for ZAPQ, you can remove this.
>>
>> Okay
>>
>>>
>>>> @@ -824,7 +853,7 @@ static int vfio_ap_mdev_reset_queues(struct
>>>> mdev_device *mdev)
>>>> matrix_mdev->matrix.apm_max + 1) {
>>>> for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
>>>> matrix_mdev->matrix.aqm_max + 1) {
>>>> - ret = vfio_ap_mdev_reset_queue(apid, apqi, 1);
>>>> + ret = vfio_ap_mdev_reset_queue(apid, apqi);
>>>
>>> IMHO, since you are at changing this call, passing the apqn as
>>> parameter would be a good simplification.
>>
>> Okay.
>
> Sorry, I should have add: NIT.
>
>>
>>>
>>>
>>>
>>>> /*
>>>> * Regardless whether a queue turns out to be busy, or
>>>> * is not operational, we need to continue resetting
>>>
>>> Depends on why the reset failed, but this is out of scope.
>>
>> I'm not sure what you mean by out of scope here, but you do make a valid
>> point. If the response code for the zapq is AP_RESPONSE_DECONFIGURED,
>> there is probably no sense in continuing to reset queues for that
>> particular adapter. I'll consider a change here.
>
> Yes, this was the point, but I consider this as a enhancement, trying a
> reset on bad queues AFAIK do no arm.
I included the enhancement in the forthcoming v4 series.
>
>>
>>>
>>>>
>>>
>>>
>>
>
>
Powered by blists - more mailing lists