[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <3a1863b1-f0f2-92a6-09c4-eab1faafcd41@linux.ibm.com>
Date: Mon, 9 Jan 2023 14:40:28 -0500
From: Anthony Krowiak <akrowiak@...ux.ibm.com>
To: Halil Pasic <pasic@...ux.ibm.com>
Cc: linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, jjherne@...ux.ibm.com, freude@...ux.ibm.com,
borntraeger@...ibm.com, cohuck@...hat.com, mjrosato@...ux.ibm.com,
alex.williamson@...hat.com, kwankhede@...dia.com,
fiuczy@...ux.ibm.com
Subject: Re: [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources
On 12/20/22 12:24 PM, Halil Pasic wrote:
> On Tue, 20 Dec 2022 09:33:03 -0500
> Anthony Krowiak <akrowiak@...ux.ibm.com> wrote:
>
>> On 12/19/22 9:10 AM, Halil Pasic wrote:
>>> On Tue, 13 Dec 2022 10:44:37 -0500
>>> Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
>>>
>>>> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error
>>>> not handled by a case statement.
>>> Why?
>>
>> If the ZAPQ failed, then instructions submitted to the same queue will
>> likewise fail. Are you saying it's not safe to assume, therefore, that
>> interrupts will not be occurring?
> Right. We are talking about the default branch here, and I suppose, the
> codes where we know that it is safe to assume that no reset is needed
> handled separately (AP_RESPONSE_DECONFIGURED).
>
> I'm not convinced that if we take the default branch we can safely
> assume, that we won't see any interrupts.
>
> For example consider hot-unplug as done by KVM. We modify the
> CRYCB/APCB with all vCPUS take out of SIE, but we don't keep
> the vCPUs out of SIE until the resets of the unpugged queues
> are done, and we don't do any extra interrupt disablement
> with all vCPUs keept out of SIE. So I believe currently there
> may be a window where the guest can observe a 01 but the
> interrupts are still live. That may be a bug, but IMHO it ain't clear
> cut.
>
> But it is not just about interrupts. Before we returned an error
> code, which gets propagated to the userspace if this reset was
> triggered via the ioctl.
>
> With this change, ret seems to be uninitialized when returned
> if we take the code path which you change here. So we would
> end up logging a warning and returning garbage?
That was an oversight. The -EIO value was returned previously, so the
ret = -EIO should be set in the default case.
>
> One could also debate, whether RCs introduced down the road
> can affect the logic here (even if the statement "if we
> see an RC other that 00 and 02, we don't need to pursue a
> reset any further, and interrpts are disabled" were to be
> guaranteed to be true now, new RCs could theoretically mess
> this up).
I think that would be the case regardless of this change. If new RCs are
introduced, this function ought to be revisited anyway and appropriate
changes made.
>
>
>>
>>> I'm afraid this is a step in the wrong direction...
>>
>> Please explain why.
>>
> Sorry, I kept this brief because IMHO it is your job to tell us why
> this needs to be changed. But I gave in, as you see.
>
> Regards,
> Halil
Powered by blists - more mailing lists