Message-ID: <87ilsgzfpv.wl-maz@kernel.org>
Date:   Mon, 14 Mar 2022 16:00:44 +0000
From:   Marc Zyngier <maz@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     linux-kernel@...r.kernel.org, John Garry <john.garry@...wei.com>,
        David Decotigny <ddecotig@...gle.com>
Subject: Re: [PATCH] genirq/msi: Shutdown managed interrupts with unsatisfiable affinities

On Mon, 14 Mar 2022 15:27:10 +0000,
Thomas Gleixner <tglx@...utronix.de> wrote:
> 
> On Mon, Mar 07 2022 at 19:06, Marc Zyngier wrote:
> > When booting with maxcpus=<small number>, interrupt controllers
> > such as the GICv3 ITS may not be able to satisfy the affinity of
> > some managed interrupts, as some of the HW resources are simply
> > not available.
> 
> This is also true if you have offlined lots of CPUs, right?

Not quite. If you offline the CPUs, the interrupts are placed in the
shutdown state as expected, because they were initially activated
while an online CPU was still available. The issue here is with the
initial activation of the interrupt, which currently happens even if
no matching CPU is present.
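
For reference, the existing CPU-online path already handles the
reverse direction: a managed interrupt left in the shutdown state is
started up again once a CPU from its affinity mask comes back. Roughly
(simplified and paraphrased from memory of kernel/irq/cpuhotplug.c,
not quoted verbatim):

	static void irq_restore_affinity_of_irq(struct irq_desc *desc,
						unsigned int cpu)
	{
		struct irq_data *data = irq_desc_get_irq_data(desc);
		const struct cpumask *affinity = irq_data_get_affinity_mask(data);

		/* Only managed interrupts targeting this CPU are of interest */
		if (!irqd_affinity_is_managed(data) || !desc->action ||
		    !cpumask_test_cpu(cpu, affinity))
			return;

		/* Shut down earlier for lack of a CPU? Start it up again. */
		if (irqd_is_managed_and_shutdown(data)) {
			irq_startup(desc, IRQ_RESUME, IRQ_START_COND);
			return;
		}

		/* Otherwise just restore the managed affinity */
		irq_set_affinity_locked(data, affinity, false);
	}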

> 
> > In order to deal with this, do not try to activate such an
> > interrupt if there is no online CPU capable of handling it.
> > Instead, place it in shutdown state. Once a capable CPU shows up,
> > it will be activated.
> >
> > Reported-by: John Garry <john.garry@...wei.com>
> > Reported-by: David Decotigny <ddecotig@...gle.com>
> > Signed-off-by: Marc Zyngier <maz@...nel.org>
> > ---
> >  kernel/irq/msi.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
> > index 2bdfce5edafd..aa84ce84c2ec 100644
> > --- a/kernel/irq/msi.c
> > +++ b/kernel/irq/msi.c
> > @@ -818,6 +818,18 @@ static int msi_init_virq(struct irq_domain *domain, int virq, unsigned int vflag
> >  		irqd_clr_can_reserve(irqd);
> >  		if (vflags & VIRQ_NOMASK_QUIRK)
> >  			irqd_set_msi_nomask_quirk(irqd);
> > +
> > +		/*
> > +		 * If the interrupt is managed but no CPU is available
> > +		 * to service it, shut it down until better times.
> > +		 */
> > +		if ((vflags & VIRQ_ACTIVATE) &&
> > +		    irqd_affinity_is_managed(irqd) &&
> > +		    !cpumask_intersects(irq_data_get_affinity_mask(irqd),
> > +					cpu_online_mask)) {
> > +			    irqd_set_managed_shutdown(irqd);
> 
> Hrm. Why is this in the !CAN_RESERVE path and not before the actual
> activation call?

VIRQ_CAN_RESERVE can only be set as a consequence of
GENERIC_IRQ_RESERVATION_MODE, which only exists on x86. Given that x86
is already super careful not to activate an interrupt that is not
immediately required, I thought we could avoid putting this check on
that path.

But if I got the above wrong (which is, let's face it, extremely
likely), I'm happy to kick it down the road next to the activation
call.
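
Just to make sure we are talking about the same thing, moving it next
to the activation would look something like this (completely untested,
reconstructed from memory of the surrounding code, and the helper name
is made up for illustration):

	/* Hypothetical helper, name invented for this example */
	static bool msi_virq_can_activate(struct irq_data *irqd)
	{
		if (!irqd_affinity_is_managed(irqd))
			return true;

		/* Only activate if at least one target CPU is online */
		return cpumask_intersects(irq_data_get_affinity_mask(irqd),
					  cpu_online_mask);
	}

and then in msi_init_virq(), just before the activation:

	if (!(vflags & VIRQ_ACTIVATE))
		return 0;

	if (!msi_virq_can_activate(irqd)) {
		/* No usable CPU yet, leave it shut down until one shows up */
		irqd_set_managed_shutdown(irqd);
		return 0;
	}

	ret = irq_domain_activate_irq(irqd, vflags & VIRQ_CAN_RESERVE);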

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
