linux-kernel - Re: [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources

Message-ID: <20221220182407.5959a4b6.pasic@linux.ibm.com>
Date:   Tue, 20 Dec 2022 18:24:07 +0100
From:   Halil Pasic <pasic@...ux.ibm.com>
To:     Anthony Krowiak <akrowiak@...ux.ibm.com>
Cc:     linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, jjherne@...ux.ibm.com, freude@...ux.ibm.com,
        borntraeger@...ibm.com, cohuck@...hat.com, mjrosato@...ux.ibm.com,
        alex.williamson@...hat.com, kwankhede@...dia.com,
        fiuczy@...ux.ibm.com, Halil Pasic <pasic@...ux.ibm.com>
Subject: Re: [PATCH 7/7] s390/vfio_ap: always clean up IRQ resources

On Tue, 20 Dec 2022 09:33:03 -0500
Anthony Krowiak <akrowiak@...ux.ibm.com> wrote:

> On 12/19/22 9:10 AM, Halil Pasic wrote:
> > On Tue, 13 Dec 2022 10:44:37 -0500
> > Tony Krowiak <akrowiak@...ux.ibm.com> wrote:
> >  
> >> Clean up IRQ resources even when a PQAP(ZAPQ) function fails with an error
> >> not handled by a case statement.  
> > Why?  
> 
> 
> If the ZAPQ failed, then instructions submitted to the same queue will 
> likewise fail. Are you saying it's not safe to assume, therefore, that 
> interrupts will not be occurring?

Right. We are talking about the default branch here, and I suppose the
codes where we know that it is safe to assume that no reset is needed
are handled separately (AP_RESPONSE_DECONFIGURED).

I'm not convinced that if we take the default branch we can safely
assume that we won't see any interrupts.

For example, consider hot-unplug as done by KVM. We modify the
CRYCB/APCB with all vCPUs taken out of SIE, but we don't keep
the vCPUs out of SIE until the resets of the unplugged queues
are done, and we don't do any extra interrupt disablement
with all vCPUs kept out of SIE. So I believe currently there
may be a window where the guest can observe a 01 but the
interrupts are still live. That may be a bug, but IMHO it ain't clear
cut.

But it is not just about interrupts. Before this change, we returned an
error code, which gets propagated to userspace if this reset was
triggered via the ioctl.

With this change, ret seems to be uninitialized when returned 
if we take the code path which you change here. So we would
end up logging a warning and returning garbage?
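
To make the concern concrete, here is a minimal user-space sketch of the
shape I would expect, not the actual driver code. The function name, the
enum, and the numeric values are assumptions made up for illustration; the
only point is that every branch of the switch, including the default one,
assigns ret before it is returned.

```c
#include <errno.h>

/* Hypothetical response codes; names and values are assumptions. */
enum sketch_rc {
	SKETCH_RC_NORMAL            = 0x00,
	SKETCH_RC_RESET_IN_PROGRESS = 0x02,
};

/*
 * Map a reset response code to a return value. If the default branch
 * only logged a warning and fell through without assigning ret, the
 * caller would receive an indeterminate value.
 */
int sketch_reset_result(unsigned int rc)
{
	int ret;

	switch (rc) {
	case SKETCH_RC_NORMAL:
		ret = 0;
		break;
	case SKETCH_RC_RESET_IN_PROGRESS:
		ret = -EBUSY;
		break;
	default:
		/* Unknown or unhandled code: report a definite error. */
		ret = -EIO;
		break;
	}

	return ret;
}
```

With that shape, an unexpected response code still yields a well-defined
error that can be propagated to the ioctl caller.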

One could also debate whether RCs introduced down the road
can affect the logic here (even if the statement "if we
see an RC other than 00 and 02, we don't need to pursue a
reset any further, and interrupts are disabled" were
guaranteed to be true now, new RCs could theoretically mess
this up).
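
One way to stay robust against that is to list the known-safe codes
explicitly instead of treating "everything else" as safe. The sketch below
is hypothetical: the helper name and the value 0x03 for the deconfigured
case are assumptions, and which codes really belong on the list is exactly
what would need to be argued in the commit message.

```c
#include <stdbool.h>

/* Assumed value for the "deconfigured" response code. */
#define SKETCH_RC_DECONFIGURED 0x03

/*
 * Return true only for codes explicitly known to mean that no reset
 * is needed. Anything not listed, including codes introduced later,
 * is treated conservatively.
 */
bool sketch_known_no_reset_needed(unsigned int rc)
{
	switch (rc) {
	case SKETCH_RC_DECONFIGURED:
		return true;
	default:
		return false;
	}
}
```

A new RC then lands in the conservative path by default, rather than
silently inheriting the "interrupts are disabled" assumption.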

> 
> 
> >
> > I'm afraid this is a step in the wrong direction...  
> 
> 
> Please explain why.
> 

Sorry, I kept this brief because IMHO it is your job to tell us why
this needs to be changed. But I gave in, as you see.

Regards,
Halil