Message-ID: <7aaeeb3d-1e1b-6166-84e9-481153811b62@suse.com>
Date: Mon, 8 Feb 2021 10:41:00 +0100
From: Jürgen Groß <jgross@...e.com>
To: Julien Grall <julien@....org>, xen-devel@...ts.xenproject.org,
linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
netdev@...r.kernel.org, linux-scsi@...r.kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Stefano Stabellini <sstabellini@...nel.org>,
stable@...r.kernel.org,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Roger Pau Monné <roger.pau@...rix.com>,
Jens Axboe <axboe@...nel.dk>, Wei Liu <wei.liu@...nel.org>,
Paul Durrant <paul@....org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH 0/7] xen/events: bug fixes and some diagnostic aids
On 08.02.21 10:11, Julien Grall wrote:
> Hi Juergen,
>
> On 07/02/2021 12:58, Jürgen Groß wrote:
>> On 06.02.21 19:46, Julien Grall wrote:
>>> Hi Juergen,
>>>
>>> On 06/02/2021 10:49, Juergen Gross wrote:
>>>> The first three patches are fixes for XSA-332. They avoid WARN splats
>>>> and fix a performance issue with interdomain events.
>>>
>>> Thanks for helping to figure out the problem. Unfortunately, I can
>>> still reliably see the WARN splat with the latest Linux master
>>> (1e0d27fce010) + your first 3 patches.
>>>
>>> I am using Xen 4.11 (1c7d984645f9) and dom0 is forced to use the 2L
>>> events ABI.
>>>
>>> After some debugging, I think I have an idea of what went wrong. The
>>> problem happens when the event is initially bound from vCPU0 to a
>>> different vCPU.
>>>
>>> From the comment in xen_rebind_evtchn_to_cpu(), we are masking the
>>> event to prevent it being delivered on an unexpected vCPU. However, I
>>> believe the following can happen:
>>>
>>> vCPU0                 | vCPU1
>>>                       |
>>>                       | Call xen_rebind_evtchn_to_cpu()
>>> receive event X       |
>>>                       | mask event X
>>>                       | bind to vCPU1
>>> <vCPU descheduled>    | unmask event X
>>>                       |
>>>                       | receive event X
>>>                       |
>>>                       | handle_edge_irq(X)
>>> handle_edge_irq(X)    |  -> handle_irq_event()
>>>                       |   -> set IRQD_IN_PROGRESS
>>>  -> set IRQS_PENDING  |
>>>                       |   -> evtchn_interrupt()
>>>                       |   -> clear IRQD_IN_PROGRESS
>>>                       |  -> IRQS_PENDING is set
>>>                       |  -> handle_irq_event()
>>>                       |   -> evtchn_interrupt()
>>>                       |    -> WARN()
>>>                       |
>>>
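
For reference, the rebind path being discussed looks roughly like the
sketch below. This is only a simplified illustration of the sequence in
xen_rebind_evtchn_to_cpu() (helper names and error handling may differ
from the real drivers/xen/events/events_base.c); it shows where the
mask/unmask window in the diagram above comes from:

static int xen_rebind_evtchn_to_cpu(evtchn_port_t evtchn, unsigned int tcpu)
{
        struct evtchn_bind_vcpu bind_vcpu;
        bool was_masked;

        /* Mask the event so it is not delivered on an unexpected vCPU
         * while the binding is being changed. */
        was_masked = test_and_set_mask(evtchn);

        bind_vcpu.port = evtchn;
        bind_vcpu.vcpu = xen_vcpu_nr(tcpu);

        /* Ask Xen to deliver the event to the new vCPU from now on. */
        if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) == 0)
                bind_evtchn_to_cpu(evtchn, tcpu);

        /* Unmasking can make the event fire immediately on the new vCPU,
         * while the old vCPU may still be about to handle an instance it
         * received just before the mask above. */
        if (!was_masked)
                unmask_evtchn(evtchn);

        return 0;
}
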
>>> All the lateeoi handlers expect ONESHOT semantics, and
>>> evtchn_interrupt() doesn't tolerate any deviation.
>>>
>>> I think the problem was introduced by 7f874a0447a9 ("xen/events: fix
>>> lateeoi irq acknowledgment"), because previously the interrupt was kept
>>> disabled, so we wouldn't do another iteration in handle_edge_irq().
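
For context, the core flow handler behaves roughly like the condensed
sketch below (locking, masking and ack details of kernel/irq/chip.c are
omitted). It shows why the second vCPU ends up calling evtchn_interrupt()
twice once IRQS_PENDING has been set:

void handle_edge_irq(struct irq_desc *desc)
{
        if (irqd_irq_disabled(&desc->irq_data) ||
            irqd_irq_inprogress(&desc->irq_data)) {
                /* Someone else is already running the handler: just note
                 * the new edge so it can be replayed later. */
                desc->istate |= IRQS_PENDING;
                return;
        }

        do {
                desc->istate &= ~IRQS_PENDING;
                /* Invokes evtchn_interrupt() for the lateeoi channel and
                 * sets/clears IRQD_IRQ_INPROGRESS around the call. */
                handle_irq_event(desc);
                /* If another vCPU set IRQS_PENDING in the meantime, the
                 * handler runs again right away; that second invocation
                 * trips the WARN() in evtchn_interrupt(). */
        } while (desc->istate & IRQS_PENDING);
}
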
>>
>> I think you picked the wrong commit for blaming, as this is just
>> the last patch of the three patches you were testing.
>
> I actually found the right commit for blaming but I copied the
> information from the wrong shell :/. The bug was introduced by:
>
> c44b849cee8c ("xen/events: switch user event channels to lateeoi model")
>
>>
>>> Aside from the handlers, I think it may impact the deferred EOI
>>> mitigation, because in theory a 3rd vCPU could join the party (say
>>> vCPU A migrates the event from vCPU B to vCPU C). So info->{eoi_cpu,
>>> irq_epoch, eoi_time} could possibly get mangled?
>>>
>>> For a fix, we may want to consider holding evtchn_rwlock for
>>> writing, although I am not 100% sure this is going to prevent
>>> everything.
>>
>> It will make things worse, as it would violate the locking hierarchy
>> (xen_rebind_evtchn_to_cpu() is called with the IRQ-desc lock held).
>
> Ah, right.
>
>>
>> At first glance I think we'll need a 3rd masking state ("temporarily
>> masked") in the second patch in order to avoid a race with lateeoi.
>>
>> In order to avoid the race you outlined above we need an "event is being
>> handled" indicator checked via test_and_set() semantics in
>> handle_irq_for_port() and reset only when calling clear_evtchn().
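
As a rough illustration of that idea (EVT_IN_HANDLING, info_for_port()
and clear_pending_bit() are made-up names here, not existing kernel
interfaces, and the real patch may well look different):

static void handle_irq_for_port(evtchn_port_t port,
                                struct evtchn_loop_ctrl *ctrl)
{
        struct irq_info *info = info_for_port(port);   /* hypothetical lookup */

        /* Claim the event with test_and_set() semantics; if another vCPU
         * is already handling this port, don't invoke the handler again. */
        if (test_and_set_bit(EVT_IN_HANDLING, &info->flags))
                return;

        generic_handle_irq(info->irq);
}

static void clear_evtchn(evtchn_port_t port)
{
        struct irq_info *info = info_for_port(port);   /* hypothetical lookup */

        clear_pending_bit(port);        /* ack the event towards Xen */

        /* Only now may a new instance of the event be handled. */
        clear_bit(EVT_IN_HANDLING, &info->flags);
}
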
>
> It feels like we are trying to work around the IRQ flow we are using
> (i.e. handle_edge_irq()).

I'm not really sure this is the main problem here. According to your
analysis the main problem occurs when handling the event, not when
handling the IRQ: the event is being received on two vCPUs.

Our problem isn't due to the IRQ still being pending, but due to it being
raised again, which would happen the same way for a oneshot IRQ.

But maybe I'm misunderstanding your idea.

Juergen