Message-ID: <efe052679277adf38f0b843420fec7657ed3dc37.camel@amazon.com>
Date:   Tue, 23 May 2023 11:59:43 +0000
From:   "Gowans, James" <jgowans@...zon.com>
To:     "tglx@...utronix.de" <tglx@...utronix.de>,
        "maz@...nel.org" <maz@...nel.org>,
        "liaochang1@...wei.com" <liaochang1@...wei.com>
CC:     "zouyipeng@...wei.com" <zouyipeng@...wei.com>,
        "Raslan, KarimAllah" <karahmed@...zon.com>,
        "Woodhouse, David" <dwmw@...zon.co.uk>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "chris.zjh@...wei.com" <chris.zjh@...wei.com>
Subject: Re: [PATCH] irq: fasteoi handler re-runs on concurrent invoke

On Tue, 2023-05-23 at 11:15 +0800, liaochang (A) wrote:
> > 1. Do we need to mask the IRQ and then unmask it later? I don't think so
> > but it's not entirely clear why handle_edge_irq does this anyway; it's
> > an edge IRQ so I'm not sure why it needs to be masked.
> 
> In the GIC architecture, a CPU that is handling an IRQ cannot be signaled
> with a new interrupt until it satisfies the interrupt preemption requirements
> defined in the GIC architecture. One of these requirements is that the
> priority of the new pending interrupt is higher than the priority of the
> current running interrupt. Obviously, the same interrupt source cannot
> preempt itself. Additionally, interrupt priority is rarely enabled in
> the Linux kernel, except for pseudo-NMIs.
> 
> If the interrupt is an LPI, and interrupt handling is still in progress on
> the original CPU, the lack of the ACTIVE state means that the LPI can be
> distributed to another CPU once its affinity has been changed. On the other
> hand, since the desc->istate is marked IRQS_PENDING and this interrupt
> has been consumed on the new CPU, there is no need to mask it and then
> unmask it later.

Thanks, this makes sense to me and matches my understanding. What I was
actually unsure about was why handle_edge_irq() *does* do the masking. I
guess it's because handle_edge_irq() runs desc->irq_data.chip->irq_ack(),
and perhaps if that isn't run (in the concurrent invoke case) then the IRQ
will keep getting re-triggered? Wild speculation. :-)
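
For reference, handle_edge_irq()'s flow is roughly the below (heavily
trimmed from kernel/irq/chip.c - go read the real thing, this is just a
sketch):

void handle_edge_irq(struct irq_desc *desc)
{
	raw_spin_lock(&desc->lock);
	desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);

	if (!irq_may_run(desc)) {
		/* Racing with another CPU (or disabled): remember the
		 * edge and mask+ack so the line stays quiet until it
		 * gets replayed. */
		desc->istate |= IRQS_PENDING;
		mask_ack_irq(desc);
		goto out_unlock;
	}

	/* Ack in the chip before running the handler, so the hardware
	 * can latch a new edge while the handler runs. */
	desc->irq_data.chip->irq_ack(&desc->irq_data);

	do {
		/* ... the real code also re-checks desc->action and
		 * unmasks a masked-but-pending IRQ here ... */
		handle_irq_event(desc);
	} while ((desc->istate & IRQS_PENDING) &&
		 !irqd_irq_disabled(&desc->irq_data));

out_unlock:
	raw_spin_unlock(&desc->lock);
}

So the mask seems tied to the can't-run-now path: mask+ack parks the
edge as IRQS_PENDING, and the loop unmasks and re-runs once the first
invocation is done.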

> > 2. Should the EOI delivery be inside the do loop after every handler
> > run? I think outside the loops is best; only inform the chip to deliver
> > more IRQs once all pending runs have been finished.
> 
> The GIC architecture requires that writes to the EOI register be in the
> exact reverse order of reads from the IAR register. Therefore, IAR and
> EOI must be paired. Writing EOI inside the do loop after every handler
> run may cause subtle problems.
> 
Ah ha! (where is this documented, btw?) And because the IAR read happens
really early in (for example) __gic_handle_irq_from_irqson(), it only
makes sense to call EOI once when returning from the interrupt context. That
also means that in the concurrent invoke case we would need to run EOI
even if the handler was not invoked.
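
FWIW that matches handle_fasteoi_irq()'s existing early-exit path: one
EOI per IAR read, issued even when no handler runs. Abridged sketch
(again trimmed, not the verbatim source):

void handle_fasteoi_irq(struct irq_desc *desc)
{
	struct irq_chip *chip = desc->irq_data.chip;

	raw_spin_lock(&desc->lock);

	if (!irq_may_run(desc))
		goto out;	/* handler not invoked on this CPU... */

	/* ... pending/disabled checks elided ... */
	handle_irq_event(desc);

	/* One EOI after all handling, pairing with the one IAR read */
	cond_unmask_eoi_irq(desc, chip);

	raw_spin_unlock(&desc->lock);
	return;
out:
	/* ...but the IAR read still happened, so EOI is still sent */
	if (!(chip->flags & IRQCHIP_EOI_IF_HANDLED))
		chip->irq_eoi(&desc->irq_data);
	raw_spin_unlock(&desc->lock);
}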

> > 3. Do we need to check that desc->action is still set? I don't know if
> > it can be un-set while the IRQ is still enabled.
> 
> This check is necessary here. When the code enters this critical section,
> the kernel running on another CPU might have already unregistered the IRQ
> action via the free_irq API. Although free_irq is typically used in code
> paths called on module unload or exception handling, we have also observed
> that virtualization stacks using VFIO as a PCI backend frequently call
> free_irq from regular code paths.
> 

Okay, I'm a bit hazy on whether it is or should be possible to unregister
the IRQ action while the handler is still running - it sounds like you're
saying this is possible and safe. More code spelunking would be necessary
to verify this, but I won't bother, seeing as it looks like the solution
we're tending towards uses check_irq_resend() instead of invoking the
handler directly.
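
For the record, I'd expect the check_irq_resend() variant to be shaped
something like the below - purely a sketch of the idea (the function
name and exact placement are made up), not the actual patch:

static void fasteoi_concurrent_sketch(struct irq_desc *desc)
{
	raw_spin_lock(&desc->lock);

	if (!irq_may_run(desc)) {
		/* Another CPU is mid-handler after an affinity change:
		 * record the event and take the usual EOI exit path. */
		desc->istate |= IRQS_PENDING;
		goto out;
	}

	/* ... usual action/disabled checks, then run the handler ... */
	handle_irq_event(desc);

	/* Back under desc->lock on the owning CPU: if a concurrent
	 * invoke was recorded, replay the IRQ through the chip rather
	 * than re-running the handler here. The replay goes through
	 * the normal entry checks again, which also covers the
	 * desc->action lifetime question above. */
	if (unlikely(desc->istate & IRQS_PENDING))
		check_irq_resend(desc, false);
out:
	/* ... EOI (on both paths, pairing with the IAR read) ... */
	raw_spin_unlock(&desc->lock);
}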

JG
