Message-ID: <m1hbyxv8ep.fsf@fess.ebiederm.org>
Date: Wed, 03 Jun 2009 04:55:26 -0700
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Gary Hade <garyhade@...ibm.com>
Cc: mingo@...e.hu, mingo@...hat.com, linux-kernel@...r.kernel.org,
tglx@...utronix.de, hpa@...or.com, x86@...nel.org,
yinghai@...nel.org, lcm@...ibm.com
Subject: Re: [RESEND] [PATCH v2] [BUGFIX] x86/x86_64: fix CPU offlining triggered "inactive" device IRQ interruption

Gary Hade <garyhade@...ibm.com> writes:
> Impact: Eliminates a race that can leave the system in an
> unusable state
>
> During rapid offlining of multiple CPUs there is a chance
> that an IRQ affinity move destination CPU will be offlined
> before the IRQ affinity move initiated during the offlining
> of a previous CPU completes. This can happen when the device
> is not very active and thus fails to generate the IRQ that is
> needed to complete the IRQ affinity move before the move
> destination CPU is offlined. When this happens,
> __assign_irq_vector() returns -EBUSY during the offlining of
> the IRQ move destination CPU, which prevents initiation of a
> new IRQ affinity move to an online CPU. This leaves the IRQ
> affinity set to an offlined CPU.
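>
> For reference, the -EBUSY comes from a check of this shape near
> the top of __assign_irq_vector() (a heavily elided sketch of the
> tip-tree code of that era, not the full function):
>
>   static int
>   __assign_irq_vector(int irq, struct irq_cfg *cfg,
>                       const struct cpumask *mask)
>   {
>           /*
>            * A previous affinity move has not been cleaned up
>            * yet; refuse to start another move for this IRQ.
>            */
>           if (cfg->move_in_progress)
>                   return -EBUSY;
>           ...
>   }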
>
> I have been able to reproduce the problem on some of our
> systems using the following script. When the system is idle,
> the problem often reproduces during the first CPU offlining
> sequence.

Nacked-by: "Eric W. Biederman" <ebiederm@...ssion.com>

fixup_irqs() is broken for allowing such a thing.
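
For context, the offline path that initiates these moves is
fixup_irqs(); a simplified sketch of its core loop in the kernels of
that era (error handling and special cases elided):

	for_each_irq_desc(irq, desc) {
		const struct cpumask *affinity = desc->affinity;

		/* If every CPU in the mask is going away, fall
		 * back to allowing any online CPU. */
		if (cpumask_any_and(affinity, cpu_online_mask) >= nr_cpu_ids)
			affinity = cpu_all_mask;

		if (desc->chip->set_affinity)
			desc->chip->set_affinity(irq, affinity);
	}

It retargets each IRQ away from the dying CPU but takes no account of
a move that is still in flight from an earlier offline.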
> #!/bin/sh
>
> SYS_CPU_DIR=/sys/devices/system/cpu
> VICTIM_IRQ=25
> IRQ_MASK=f0
>
> iteration=0
> while true; do
>     echo $iteration
>     echo $IRQ_MASK > /proc/irq/$VICTIM_IRQ/smp_affinity
>     for cpudir in $SYS_CPU_DIR/cpu[1-9] $SYS_CPU_DIR/cpu??; do
>         echo 0 > $cpudir/online
>     done
>     for cpudir in $SYS_CPU_DIR/cpu[1-9] $SYS_CPU_DIR/cpu??; do
>         echo 1 > $cpudir/online
>     done
>     iteration=`expr $iteration + 1`
> done
>
> The proposed fix takes advantage of the fact that when all
> CPUs in the old domain are offline there is nothing to be done
> by send_cleanup_vector() during the affinity move completion.
> So, we simply avoid setting cfg->move_in_progress, which
> prevents the above-mentioned -EBUSY return from
> __assign_irq_vector().
> This allows initiation of a new IRQ affinity move to a CPU
> that is not going offline.
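>
> Concretely, the "nothing to be done" observation follows from the
> shape of send_cleanup_vector() (again a sketch, allocation and
> fallback details elided): the cleanup IPI only ever goes to CPUs
> that are both in the old domain and still online, so an empty
> intersection means no cleanup work at all:
>
>   static void send_cleanup_vector(struct irq_cfg *cfg)
>   {
>           cpumask_var_t cleanup_mask;
>
>           /* alloc_cpumask_var() error handling elided */
>           cpumask_and(cleanup_mask, cfg->old_domain, cpu_online_mask);
>           apic->send_IPI_mask(cleanup_mask, IRQ_MOVE_CLEANUP_VECTOR);
>           cfg->move_in_progress = 0;
>   }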
>
> Successfully tested with Ingo's linux-2.6-tip (32 and 64-bit
> builds) on the IBM x460, x3550 M2, x3850, and x3950 M2.
>
> v2: modified to integrate with Yinghai Lu's
> "x86/irq: remove leftover code from NUMA_MIGRATE_IRQ_DESC"
> patch, which modified intersecting lines. Only comment
> changes were affected. The actual change to the code
> is the same.
>
> Signed-off-by: Gary Hade <garyhade@...ibm.com>
>
> ---
> arch/x86/kernel/apic/io_apic.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> Index: linux-2.6-tip/arch/x86/kernel/apic/io_apic.c
> ===================================================================
> --- linux-2.6-tip.orig/arch/x86/kernel/apic/io_apic.c 2009-05-14 14:06:30.000000000 -0700
> +++ linux-2.6-tip/arch/x86/kernel/apic/io_apic.c 2009-05-14 14:09:42.000000000 -0700
> @@ -1218,8 +1218,11 @@ next:
>  		current_vector = vector;
>  		current_offset = offset;
>  		if (old_vector) {
> -			cfg->move_in_progress = 1;
>  			cpumask_copy(cfg->old_domain, cfg->domain);
> +			if (cpumask_intersects(cfg->old_domain,
> +					       cpu_online_mask)) {
> +				cfg->move_in_progress = 1;
> +			}
>  		}
>  		for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask)
>  			per_cpu(vector_irq, new_cpu)[vector] = irq;