linux-kernel - Re: [PATCH v3 3/3] genirq: Use the maple tree for IRQ descriptors management

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87mt2f2mhm.ffs@tglx>
Date:   Mon, 08 May 2023 11:36:37 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Yujie Liu <yujie.liu@...el.com>
Cc:     Shanker Donthineni <sdonthineni@...dia.com>,
        oe-lkp@...ts.linux.dev, lkp@...el.com,
        linux-kernel@...r.kernel.org, Marc Zyngier <maz@...nel.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Michael Walle <michael@...le.cc>,
        Vikram Sethi <vsethi@...dia.com>
Subject: Re: [PATCH v3 3/3] genirq: Use the maple tree for IRQ descriptors
 management

Yujie!

On Sun, May 07 2023 at 16:05, Yujie Liu wrote:
> Sorry for late reply as we were on public holiday earlier this week.

Holidays are more important and the problems do not run away :)

> On Fri, Apr 28, 2023 at 12:31:14PM +0200, Thomas Gleixner wrote:
>> Under the assumption that the code is correct, then the effect of this
>> patch is that it changes the timing. Sigh.
>> 
>>   1) Does this happen with a 64-bit kernel too?
>
> It doesn't happen on a 64-bit kernel:

Ok. So one difference might be that a 64 bit kernel enables interrupt
rempping. Can you add 'intremap=off' to the kernel command line please?

>>   2) Can you enable the irq_vector:vector_*.* tracepoints and provide
>>      the trace?
>
> I'm a beginner of kernel and not sure if I'm doing this correctly. Here
> are my test steps:

They are perfectly fine.

> # check the trace
> # cat /sys/kernel/debug/tracing/trace
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 0/0   #P:4
> #
> #                                _-----=> irqs-off/BH-disabled
> #                               / _----=> need-resched
> #                              | / _---=> hardirq/softirq
> #                              || / _--=> preempt-depth
> #                              ||| / _-=> migrate-disable
> #                              |||| /     delay
> #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
> #              | |         |   |||||     |         |
>
> Nothing was written to trace buffer, seems like no irq_vector events
> were captured during this test.

Stupid me. I completely forgot that this happens on the outgoing CPU at
a point where the tracer for that CPU is already shut down.

Can you please apply the patch below? No need to enable the irq_vector
events. It just dumps the information into dmesg.

Thanks,

        tglx
---
--- a/kernel/irq/cpuhotplug.c
+++ b/kernel/irq/cpuhotplug.c
@@ -57,7 +57,8 @@ static bool migrate_one_irq(struct irq_d
 	bool maskchip = !irq_can_move_pcntxt(d) && !irqd_irq_masked(d);
 	const struct cpumask *affinity;
 	bool brokeaff = false;
-	int err;
+	int err, irq = d->irq;
+	bool move_pending;
 
 	/*
 	 * IRQ chip might be already torn down, but the irq descriptor is
@@ -101,10 +102,16 @@ static bool migrate_one_irq(struct irq_d
 	 * there is no move pending or the pending mask does not contain
 	 * any online CPU, use the current affinity mask.
 	 */
-	if (irq_fixup_move_pending(desc, true))
+	move_pending = irqd_is_setaffinity_pending(d);
+	if (irq_fixup_move_pending(desc, true)) {
 		affinity = irq_desc_get_pending_mask(desc);
-	else
+		pr_info("IRQ %3d: move_pending=%d pending mask: %*pbl\n",
+			irq, move_pending, cpumask_pr_args(affinity));
+	} else {
 		affinity = irq_data_get_affinity_mask(d);
+		pr_info("IRQ %3d: move_pending=%d affinity mask: %*pbl\n",
+			irq, move_pending, cpumask_pr_args(affinity));
+	}
 
 	/* Mask the chip for interrupts which cannot move in process context */
 	if (maskchip && chip->irq_mask)
@@ -136,6 +143,9 @@ static bool migrate_one_irq(struct irq_d
 		brokeaff = false;
 	}
 
+	affinity = irq_data_get_effective_affinity_mask(d);
+	pr_info("IRQ %3d: Done: %*pbl\n", irq, cpumask_pr_args(affinity));
+
 	if (maskchip && chip->irq_unmask)
 		chip->irq_unmask(d);