[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4279cab9-9b36-e83d-bd7a-ff7cd2832054@suse.com>
Date: Mon, 8 Feb 2021 14:58:11 +0100
From: Jürgen Groß <jgross@...e.com>
To: Julien Grall <julien@....org>, xen-devel@...ts.xenproject.org,
linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
netdev@...r.kernel.org, linux-scsi@...r.kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Stefano Stabellini <sstabellini@...nel.org>,
stable@...r.kernel.org,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Roger Pau Monné <roger.pau@...rix.com>,
Jens Axboe <axboe@...nel.dk>, Wei Liu <wei.liu@...nel.org>,
Paul Durrant <paul@....org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH 0/7] xen/events: bug fixes and some diagnostic aids
On 08.02.21 14:09, Julien Grall wrote:
> Hi Juergen,
>
> On 08/02/2021 12:31, Jürgen Groß wrote:
>> On 08.02.21 13:16, Julien Grall wrote:
>>>
>>>
>>> On 08/02/2021 12:14, Jürgen Groß wrote:
>>>> On 08.02.21 11:40, Julien Grall wrote:
>>>>> Hi Juergen,
>>>>>
>>>>> On 08/02/2021 10:22, Jürgen Groß wrote:
>>>>>> On 08.02.21 10:54, Julien Grall wrote:
>>>>>>> ... I don't really see how the difference matter here. The idea
>>>>>>> is to re-use what's already existing rather than trying to
>>>>>>> re-invent the wheel with an extra lock (or whatever we can come up).
>>>>>>
>>>>>> The difference is that the race is occurring _before_ any IRQ is
>>>>>> involved. So I don't see how modification of IRQ handling would help.
>>>>>
>>>>> Roughly our current IRQ handling flow (handle_eoi_irq()) looks like:
>>>>>
>>>>> if ( irq in progress )
>>>>> {
>>>>> set IRQS_PENDING
>>>>> return;
>>>>> }
>>>>>
>>>>> do
>>>>> {
>>>>> clear IRQS_PENDING
>>>>> handle_irq()
>>>>> } while (IRQS_PENDING is set)
>>>>>
>>>>> IRQ handling flow like handle_fasteoi_irq() looks like:
>>>>>
>>>>> if ( irq in progress )
>>>>> return;
>>>>>
>>>>> handle_irq()
>>>>>
>>>>> The latter flow would catch "spurious" interrupt and ignore them.
>>>>> So it would handle nicely the race when changing the event affinity.
>>>>
>>>> Sure? Isn't "irq in progress" being reset way before our "lateeoi" is
>>>> issued, thus having the same problem again?
>>>
>>> Sorry I can't parse this.
>>
>> handle_fasteoi_irq() will do nothing "if ( irq in progress )". When is
>> this condition being reset again in order to be able to process another
>> IRQ?
> It is reset after the handler has been called. See handle_irq_event().
Right. And for us this is too early, as we want the next IRQ being
handled only after we have called xen_irq_lateeoi().
>
>> I believe this will be the case before our "lateeoi" handling is
>> becoming active (more precise: when our IRQ handler is returning to
>> handle_fasteoi_irq()), resulting in the possibility of the same race we
>> are experiencing now.
>
> I am a bit confused what you mean by "lateeoi" handling is becoming
> active. Can you clarify?
See above: the next call of the handler should be allowed only after
xen_irq_lateeoi() for the IRQ has been called.
If the handler is being called earlier we have the race resulting
in the WARN() splats.
> Note that are are other IRQ flows existing. We should have a look at
> them before trying to fix thing ourself.
Fine with me, but it either needs to fit all use cases (interdomain,
IPI, real interrupts) or we need to have a per-type IRQ flow.
I think we should fix the issue locally first, then we can start to do
a thorough rework planning. Its not as if the needed changes with the
current flow would be so huge, and I'd really like to have a solution
rather sooner than later. Changing the IRQ flow might have other side
effects which need to be excluded by thorough testing.
> Although, the other issue I can see so far is handle_irq_for_port() will
> update info->{eoi_cpu, irq_epoch, eoi_time} without any locking. But it
> is not clear this is what you mean by "becoming active".
As long as a single event can't be handled on multiple cpus at the same
time, there is no locking needed.
Juergen
Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3092 bytes)
Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)
Powered by blists - more mailing lists