Message-ID: <063eff75-56a5-1af7-f684-a2ed4b13c9a7@xen.org>
Date: Mon, 8 Feb 2021 13:09:27 +0000
From: Julien Grall <julien@....org>
To: Jürgen Groß <jgross@...e.com>,
xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org,
linux-block@...r.kernel.org, netdev@...r.kernel.org,
linux-scsi@...r.kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Stefano Stabellini <sstabellini@...nel.org>,
stable@...r.kernel.org,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Roger Pau Monné <roger.pau@...rix.com>,
Jens Axboe <axboe@...nel.dk>, Wei Liu <wei.liu@...nel.org>,
Paul Durrant <paul@....org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH 0/7] xen/events: bug fixes and some diagnostic aids

Hi Juergen,

On 08/02/2021 12:31, Jürgen Groß wrote:
> On 08.02.21 13:16, Julien Grall wrote:
>>
>>
>> On 08/02/2021 12:14, Jürgen Groß wrote:
>>> On 08.02.21 11:40, Julien Grall wrote:
>>>> Hi Juergen,
>>>>
>>>> On 08/02/2021 10:22, Jürgen Groß wrote:
>>>>> On 08.02.21 10:54, Julien Grall wrote:
>>>>>> ... I don't really see how the difference matters here. The idea is
>>>>>> to re-use what already exists rather than trying to re-invent
>>>>>> the wheel with an extra lock (or whatever we can come up with).
>>>>>
>>>>> The difference is that the race is occurring _before_ any IRQ is
>>>>> involved. So I don't see how modification of IRQ handling would help.
>>>>
>>>> Roughly our current IRQ handling flow (handle_eoi_irq()) looks like:
>>>>
>>>> if ( irq in progress )
>>>> {
>>>>     set IRQS_PENDING
>>>>     return;
>>>> }
>>>>
>>>> do
>>>> {
>>>>     clear IRQS_PENDING
>>>>     handle_irq()
>>>> } while (IRQS_PENDING is set)
>>>>
>>>> IRQ handling flow like handle_fasteoi_irq() looks like:
>>>>
>>>> if ( irq in progress )
>>>>     return;
>>>>
>>>> handle_irq()
>>>>
>>>> The latter flow would catch "spurious" interrupts and ignore them. So
>>>> it would nicely handle the race when changing the event affinity.
>>>
>>> Sure? Isn't "irq in progress" being reset way before our "lateeoi" is
>>> issued, thus having the same problem again?
>>
>> Sorry I can't parse this.
>
> handle_fasteoi_irq() will do nothing "if ( irq in progress )". When is
> this condition being reset again in order to be able to process another
> IRQ?

It is reset after the handler has been called. See handle_irq_event().

> I believe this will be the case before our "lateeoi" handling is
> becoming active (more precise: when our IRQ handler is returning to
> handle_fasteoi_irq()), resulting in the possibility of the same race we
> are experiencing now.

I am a bit confused about what you mean by the "lateeoi" handling becoming
active. Can you clarify?

Note that there are other IRQ flows in existence. We should have a look at
them before trying to fix things ourselves.

That said, the other issue I can see so far is that handle_irq_for_port()
will update info->{eoi_cpu, irq_epoch, eoi_time} without any locking. But it
is not clear whether this is what you mean by "becoming active".

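To make the difference between the two flows quoted above concrete, here is
a minimal user-space sketch in C. It is only a model, not kernel code:
struct irq_model, run_handler(), flow_pending_like() and flow_fasteoi_like()
are made-up names, and the re-entrant call inside run_handler() merely stands
in for an event that arrives on another CPU while the handler is running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-IRQ state; not the kernel's struct irq_desc. */
struct irq_model {
    bool in_progress;   /* a handler is currently running for this IRQ */
    bool pending;       /* models IRQS_PENDING */
    int  handled;       /* number of handler invocations */
    int  dropped;       /* events ignored because a handler was running */
    int  nested_events; /* events to inject while the handler runs */
};

typedef void (*flow_fn)(struct irq_model *);

/* Model of the handler: optionally injects one event while "running". */
static void run_handler(struct irq_model *d, flow_fn flow)
{
    d->handled++;
    if (d->nested_events > 0) {
        d->nested_events--;
        flow(d); /* assumption: stands in for delivery on another CPU */
    }
}

/* First quoted flow: remember the event and run the handler again. */
static void flow_pending_like(struct irq_model *d)
{
    assert(d != NULL);
    if (d->in_progress) {
        d->pending = true;
        return;
    }
    d->in_progress = true;
    do {
        d->pending = false;
        run_handler(d, flow_pending_like);
    } while (d->pending);
    d->in_progress = false;
}

/* Second quoted flow (handle_fasteoi_irq()-like): ignore the event. */
static void flow_fasteoi_like(struct irq_model *d)
{
    assert(d != NULL);
    if (d->in_progress) {
        d->dropped++;
        return;
    }
    d->in_progress = true;
    run_handler(d, flow_fasteoi_like);
    d->in_progress = false; /* reset only after the handler has returned */
}
```

With one event injected during the handler, the first flow runs the handler
twice, whereas the fasteoi-like flow runs it once and drops the second event.
In both cases "in progress" is cleared only once the handler has returned.
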
Cheers,
--
Julien Grall