linux-kernel - Re: [PATCH] genirq/msi: Shutdown managed interrupts with unsatifiable affinities


Message-ID: <87fsnjzgxg.wl-maz@kernel.org>
Date:   Tue, 15 Mar 2022 09:46:51 +0000
From:   Marc Zyngier <maz@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     linux-kernel@...r.kernel.org, John Garry <john.garry@...wei.com>,
        David Decotigny <ddecotig@...gle.com>
Subject: Re: [PATCH] genirq/msi: Shutdown managed interrupts with unsatifiable affinities

On Mon, 14 Mar 2022 19:03:49 +0000,
Thomas Gleixner <tglx@...utronix.de> wrote:
> 
> On Mon, Mar 14 2022 at 16:00, Marc Zyngier wrote:
> > On Mon, 14 Mar 2022 15:27:10 +0000,
> > Thomas Gleixner <tglx@...utronix.de> wrote:
> >> 
> >> On Mon, Mar 07 2022 at 19:06, Marc Zyngier wrote:
> >> > When booting with maxcpus=<small number>, interrupt controllers
> >> > such as the GICv3 ITS may not be able to satisfy the affinity of
> >> > some managed interrupts, as some of the HW resources are simply
> >> > not available.
> >> 
> >> This is also true if you have offlined lots of CPUs, right?
> >
> > Not quite. If you offline the CPUs, the interrupts will be placed in
> > the shutdown state as expected, having initially transitioned via an
> > activation state with an online CPU. The issue here is with the
> > initial activation of the interrupt, which currently happens even if
> > no matching CPU is present.
> 
> Yes. But if you load the driver _after_ offlining lots of CPUs first
> then the same thing should happen, right?

Ah! yes, that's the exact same problem (modular drivers? that's an
idea that will never catch on...).

> 
> >> > +		/*
> >> > +		 * If the interrupt is managed but no CPU is available
> >> > +		 * to service it, shut it down until better times.
> >> > +		 */
> >> > +		if ((vflags & VIRQ_ACTIVATE) &&
> >> > +		    irqd_affinity_is_managed(irqd) &&
> >> > +		    !cpumask_intersects(irq_data_get_affinity_mask(irqd),
> >> > +					cpu_online_mask)) {
> >> > +			    irqd_set_managed_shutdown(irqd);
> >> 
> >> Hrm. Why is this in the !CAN_RESERVE path and not before the actual
> >> activation call?
> >
> > VIRQ_CAN_RESERVE can only happen as a consequence of
> > GENERIC_IRQ_RESERVATION_MODE, which only exists on x86. Given that x86
> > is already super careful not to activate an interrupt that is not
> > immediately required, I thought we could avoid putting this check on
> > that path.
> >
> > But if I got the above wrong (which is, let's face it, extremely
> > likely), I'm happy to kick it down the road next to the activation
> > call.
> 
> I just rechecked. Yes, we could push it there, but actually on x86 the
> reservation mode activation sets the entry to a spurious catch all on an
> online CPU, which is intentional.
> 
> So yes, we can keep it where it is now, but that needs a comment.

Yup, I'll add that.
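
For illustration only, the decision discussed above can be modelled in
user space. This is a rough sketch, not the kernel code: struct irq_model,
mask_intersects() and model_activate() are hypothetical names, CPU masks
are reduced to a 64-bit word, and the can_reserve flag merely stands in
for the VIRQ_CAN_RESERVE path, which the thread says only exists on x86.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU. */
typedef uint64_t cpumask_model_t;

/* Hypothetical model of the per-interrupt state discussed in the thread. */
struct irq_model {
	cpumask_model_t affinity;	/* CPUs allowed to service the interrupt */
	bool managed;			/* affinity is managed by the kernel */
	bool managed_shutdown;		/* parked until a matching CPU is online */
	bool activated;
};

static bool mask_intersects(cpumask_model_t a, cpumask_model_t b)
{
	return (a & b) != 0;
}

/*
 * Returns true if the interrupt was activated, false if it was parked.
 *
 * The check is skipped when reservation mode applies: per the thread,
 * x86 reservation-mode activation intentionally points the entry at a
 * spurious catch-all on an online CPU, so nothing needs shutting down.
 */
static bool model_activate(struct irq_model *irq, cpumask_model_t online,
			   bool can_reserve)
{
	if (!can_reserve && irq->managed &&
	    !mask_intersects(irq->affinity, online)) {
		/* No online CPU can service it: shut it down for now. */
		irq->managed_shutdown = true;
		irq->activated = false;
		return false;
	}

	irq->managed_shutdown = false;
	irq->activated = true;
	return true;
}
```

With an affinity of CPUs 4-7 and only CPUs 0-1 online (either because of
maxcpus= or because the driver was loaded after offlining), the model
parks the interrupt; once CPU 4 is online, activation goes through.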

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.