linux-kernel - Re: [PATCH] genirq/msi: Shutdown managed interrupts with unsatifiable affinities


Message-ID: <87mthsfjai.ffs@tglx>
Date:   Mon, 14 Mar 2022 20:03:49 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Marc Zyngier <maz@...nel.org>
Cc:     linux-kernel@...r.kernel.org, John Garry <john.garry@...wei.com>,
        David Decotigny <ddecotig@...gle.com>
Subject: Re: [PATCH] genirq/msi: Shutdown managed interrupts with
 unsatifiable affinities

On Mon, Mar 14 2022 at 16:00, Marc Zyngier wrote:
> On Mon, 14 Mar 2022 15:27:10 +0000,
> Thomas Gleixner <tglx@...utronix.de> wrote:
>> 
>> On Mon, Mar 07 2022 at 19:06, Marc Zyngier wrote:
>> > When booting with maxcpus=<small number>, interrupt controllers
>> > such as the GICv3 ITS may not be able to satisfy the affinity of
>> > some managed interrupts, as some of the HW resources are simply
>> > not available.
>> 
>> This is also true if you have offlined lots of CPUs, right?
>
> Not quite. If you offline the CPUs, the interrupts will be placed in
> the shutdown state as expected, having initially transitioned via an
> activation state with an online CPU. The issue here is with the
> initial activation of the interrupt, which currently happens even if
> no matching CPU is present.

Yes. But if you load the driver _after_ offlining lots of CPUs first
then the same thing should happen, right?
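
A minimal user-space sketch of the condition under discussion, assuming a toy 64-bit mask in place of struct cpumask. The toy_* names are hypothetical and are not kernel APIs. It only illustrates that the check in the quoted hunk depends on which CPUs are online at activation time, so booting with maxcpus= and loading the driver after offlining CPUs would look the same to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t toy_cpumask_t;

/* Return a mask with only the given CPU set. */
static toy_cpumask_t toy_cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (toy_cpumask_t)1 << cpu;
}

/* Models cpumask_intersects(): true if the two masks share a CPU. */
static bool toy_cpumask_intersects(toy_cpumask_t a, toy_cpumask_t b)
{
	return (a & b) != 0;
}

/*
 * Models the condition from the quoted hunk: activation was requested,
 * the interrupt is managed, and none of its affinity CPUs is online.
 * How the online mask became small (maxcpus= at boot, or CPUs offlined
 * before the driver was loaded) plays no role here.
 */
static bool toy_needs_managed_shutdown(bool activate, bool managed,
				       toy_cpumask_t affinity,
				       toy_cpumask_t online)
{
	return activate && managed &&
	       !toy_cpumask_intersects(affinity, online);
}
```

With an affinity mask of CPUs 4-5 and only CPUs 0-1 online, the sketch reports that the interrupt should go to the managed shutdown state; once CPU 4 is online it no longer does.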

>> > +		/*
>> > +		 * If the interrupt is managed but no CPU is available
>> > +		 * to service it, shut it down until better times.
>> > +		 */
>> > +		if ((vflags & VIRQ_ACTIVATE) &&
>> > +		    irqd_affinity_is_managed(irqd) &&
>> > +		    !cpumask_intersects(irq_data_get_affinity_mask(irqd),
>> > +					cpu_online_mask)) {
>> > +			    irqd_set_managed_shutdown(irqd);
>> 
>> Hrm. Why is this in the !CAN_RESERVE path and not before the actual
>> activation call?
>
> VIRQ_CAN_RESERVE can only happen as a consequence of
> GENERIC_IRQ_RESERVATION_MODE, which only exists on x86. Given that x86
> is already super careful not to activate an interrupt that is not
> immediately required, I thought we could avoid putting this check on
> that path.
>
> But if I got the above wrong (which is, let's face it, extremely
> likely), I'm happy to kick it down the road next to the activation
> call.

I just rechecked. Yes, we could push it there, but actually on x86 the
reservation mode activation sets the entry to a spurious catch-all on an
online CPU, which is intentional.

So yes, we can keep it where it is now, but that needs a comment.

Thanks,

        tglx