lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAAhSdy2TV29T2pJ_GXmYqWDvuupBMimxs06wFHnz6_jwvAbvcQ@mail.gmail.com>
Date: Wed, 15 Jan 2025 22:31:37 +0530
From: Anup Patel <anup@...infault.org>
To: Xu Lu <luxu.kernel@...edance.com>
Cc: daniel.lezcano@...aro.org, tglx@...utronix.de, paul.walmsley@...ive.com, 
	palmer@...belt.com, lihangjing@...edance.com, xieyongji@...edance.com, 
	linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [External] Re: [PATCH 1/5] irqchip/riscv-intc: Balance priority
 and fairness during irq handling

On Wed, Jan 15, 2025 at 6:08 PM Xu Lu <luxu.kernel@...edance.com> wrote:
>
> Hi Anup,
>
> I agree that a standardized NMI or SSE event is the optimal solution
> to the watchdog problem. Here I just use buddy watchdog to illustrate
> the possible consequences of interrupts with low priority getting
> starved.
>
> I believe that AIA or firmware can adjust the priority of different
> irqs. What I want to emphasize is that in existing implementation,
> kernel will continue handling newly arrived high priority irqs, no
> matter how long lower priority irqs have been waiting for.
>
> For example, if both external irq and IPI (assuming non AIA
> architecture using software irq) arrive in cycle 0, kernel will handle
> the external irq first. Then a new external irq arrives during the
> first external irq handling procedure, let's say cycle 100. Then
> kernel finishes the first external irq handling procedure, let's say
> cycle 200, it will handle the newly arrived external irq, instead of
> the already arrived IPI. The IPI won't be handled until the second
> external irq is handled, let's say cycle 300. Things get worse if
> external interrupts keep arriving.
>
> I think it is better to provide a mechanism to avoid this. So I regard
> interrupts arriving within a certain period as a round and only handle
> interrupts in the new round after all interrupts in the old round have
> been handled. The interrupt priority only takes effect in the same
> round.

Well, this flood of external interrupts from a device is a classical
problem across architectures. The best way to solve this is improve
the device driver to use better bottom-half techniques such as
NAPI in-case of network driver, threaded IRQs, etc.

Working around this in the interrupt controller driver is not the
right way.

Regards,
Anup

>
> Of course this is not the optimal way. Please give some advice if you
> also think it is necessary (for example introduce an interrupt
> priority decreasing mechanism instead).
>
> Best Regards,
>
> Xu Lu
>
> On Wed, Jan 15, 2025 at 7:39 PM Anup Patel <anup@...infault.org> wrote:
> >
> > On Mon, Jan 13, 2025 at 8:39 PM Xu Lu <luxu.kernel@...edance.com> wrote:
> > >
> > > Both csr cause and csr topi record the pending bit with the highest
> > > priority. If interrupts with high priority arrive frequently within a
> > > certain period of time, the interrupts with low priority won't get a
> > > chance to be handled.
> > >
> > > For example, if external interrupts and software interrupts arrive very
> > > frequently, the timer interrupts will never be handled. Then buddy
> > > watchdog on a buddy CPU will report a hardlockup on the current CPU while
> > > current CPU actually can receive irq.
> >
> > Platform with proper HW watchdog will not see this issue because HW
> > watchdog interrupt will be an external interrupt.
> >
> > There was an effort to standardize watchdog for RISC-V platforms but it was
> > deferred to future standardization. We should resume those discussions within
> > RVI forums.
> >
> > >
> > > This commit solves this problem by handling all pending irqs in a round.
> > > During each round, this commit handles pending irqs by their priority.
> > >
> > > Signed-off-by: Xu Lu <luxu.kernel@...edance.com>
> > > ---
> > >  drivers/irqchip/irq-riscv-intc.c | 32 ++++++++++++++++++++++++++------
> > >  1 file changed, 26 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
> > > index f653c13de62b..bc2ec26aa9e9 100644
> > > --- a/drivers/irqchip/irq-riscv-intc.c
> > > +++ b/drivers/irqchip/irq-riscv-intc.c
> > > @@ -26,20 +26,40 @@ static unsigned int riscv_intc_nr_irqs __ro_after_init = BITS_PER_LONG;
> > >  static unsigned int riscv_intc_custom_base __ro_after_init = BITS_PER_LONG;
> > >  static unsigned int riscv_intc_custom_nr_irqs __ro_after_init;
> > >
> > > +static unsigned int riscv_prio_irqs[] = {
> > > +#ifdef CONFIG_RISCV_M_MODE
> > > +       IRQ_M_EXT, IRQ_M_SOFT, IRQ_M_TIMER,
> > > +#endif
> > > +       IRQ_S_EXT, IRQ_S_SOFT, IRQ_S_TIMER, IRQ_S_GEXT,
> > > +       IRQ_VS_EXT, IRQ_VS_SOFT, IRQ_VS_TIMER,
> > > +       IRQ_PMU_OVF,
> > > +};
> > > +
> > >  static void riscv_intc_irq(struct pt_regs *regs)
> > >  {
> > > -       unsigned long cause = regs->cause & ~CAUSE_IRQ_FLAG;
> > > -
> > > -       if (generic_handle_domain_irq(intc_domain, cause))
> > > -               pr_warn_ratelimited("Failed to handle interrupt (cause: %ld)\n", cause);
> > > +       unsigned long pending = csr_read(CSR_IP) & csr_read(CSR_IE);
> > > +       unsigned int i;
> > > +
> > > +       for (i = 0; i < ARRAY_SIZE(riscv_prio_irqs); i++)
> > > +               if (pending & (1UL << riscv_prio_irqs[i]))
> > > +                       if (generic_handle_domain_irq(intc_domain, riscv_prio_irqs[i]))
> > > +                               pr_warn_ratelimited("Failed to handle interrupt (cause: %u)\n",
> > > +                                                   riscv_prio_irqs[i]);
> >
> > This is overriding the builtin priority assignment for local interrupts as
> > defined by the RISC-V privileged specification for non-AIA systems
> > which changes the expected behaviour on existing platforms.
> >
> > >  }
> > >
> > >  static void riscv_intc_aia_irq(struct pt_regs *regs)
> > >  {
> > >         unsigned long topi;
> > > +       unsigned long pending;
> > > +       unsigned int i;
> > > +
> > > +       while ((topi = csr_read(CSR_TOPI))) {
> > > +               pending = csr_read(CSR_IP) & csr_read(CSR_IE);
> > >
> > > -       while ((topi = csr_read(CSR_TOPI)))
> > > -               generic_handle_domain_irq(intc_domain, topi >> TOPI_IID_SHIFT);
> > > +               for (i = 0; i < ARRAY_SIZE(riscv_prio_irqs); i++)
> > > +                       if (pending & (1UL << riscv_prio_irqs[i]))
> > > +                               generic_handle_domain_irq(intc_domain, riscv_prio_irqs[i]);
> > > +       }
> >
> > The AIA specification already provides a mechanism to change
> > priorities of local interrupts. The firmware or bootloader can always
> > provide custom priorities to local interrupts.
> >
> > In general, requiring certain local interrupts to have higher priority is
> > platform or use-case specific and should be done as AIA configuration
> > in firmware or bootloader.
> >
> > NAK to this patch from my side since it is just adding a software
> > work-around for a missing standard watchdog in the RISC-V
> > ecosystem.
> >
> > One possible approach is to let platforms have their implementation
> > specific M-mode watchdog and for supervisor software (both HS and
> > VS-mode) we can have SBI based watchdog where the supervisor
> > watchdog expiry is an SSE (higher priority than all local interrupts).
> >
> > Regards,
> > Anup

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ