Message-ID: <52C1ABDD.6050302@redhat.com>
Date: Mon, 30 Dec 2013 12:22:37 -0500
From: Prarit Bhargava <prarit@...hat.com>
To: rui wang <ruiv.wang@...il.com>
CC: Tony Luck <tony.luck@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, X86-ML <x86@...nel.org>,
Michel Lespinasse <walken@...gle.com>,
Andi Kleen <ak@...ux.intel.com>,
Seiji Aguchi <seiji.aguchi@....com>,
Yang Zhang <yang.z.zhang@...el.com>,
Paul Gortmaker <paul.gortmaker@...driver.com>,
janet.morgan@...el.com, "Yu, Fenghua" <fenghua.yu@...el.com>,
chen gong <gong.chen@...ux.intel.com>
Subject: Re: [PATCH] x86: Add check for number of available vectors before
CPU down [v2]
On 12/30/2013 07:56 AM, rui wang wrote:
> On 12/29/13, Prarit Bhargava <prarit@...hat.com> wrote:
>>
>>
>> On 12/20/2013 04:41 AM, rui wang wrote:
> <<snip>>
>>> The vector number for an irq is programmed in the LSB of the IOAPIC
>>> IRTE (or MSI data register in the case of MSI/MSIx). So there can be
>>> only one vector number (although multiple CPUs can be specified
>>> through DM). An MSI-capable device can dynamically change the lower
>>> few bits in the LSB to signal multiple interrupts with a contiguous
>>> range of vectors in powers of 2, but each of these vectors is treated
>>> as a separate IRQ, i.e. each of them has a separate irq desc, or a
>>> separate line in the /proc/interrupts file. This patch shows the MSI
>>> irq allocation in detail:
>>> http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=51906e779f2b13b38f8153774c4c7163d412ffd9
>>>
>>> Thanks
>>> Rui
>>>
>>
>> Gong and Rui,
>>
>> After looking at this in detail I realized I made a mistake in my patch by
>> including the check for the smp_affinity. Simply put, it shouldn't be there
>> given Rui's explanation above.
>>
>> So I think the patch simply needs to do:
>>
>> 	this_count = 0;
>> 	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
>> 		irq = __this_cpu_read(vector_irq[vector]);
>> 		if (irq >= 0) {
>> 			desc = irq_to_desc(irq);
>> 			data = irq_desc_get_irq_data(desc);
>> 			affinity = data->affinity;
>> 			if (irq_has_action(irq) && !irqd_is_per_cpu(data))
>> 				this_count++;
>> 		}
>> 	}
>>
>> Can the two of you confirm the above is correct? It would be greatly
>> appreciated.
>
> An irq can be mapped to only one vector number, but can have multiple
> destination CPUs. i.e. the same irq/vector can appear on multiple
> CPUs' vector_irq[]. So checking data->affinity is necessary I think.
> But notice that data->affinity is updated in chip->irq_set_affinity()
> inside fixup_irqs(), while cpu_online_mask is updated in
> remove_cpu_from_maps() inside cpu_disable_common(). They are updated
> in different places. So the algorithm to check them against each other
> should be different, depending on where you put the check_vectors().
> That's my understanding.

Okay, so the big issue is that we need to do the calculation without this CPU,
so I think this works (sorry for the cut-and-paste):

int check_irq_vectors_for_cpu_disable(void)
{
	int irq, cpu;
	unsigned int vector, this_count, count;
	struct irq_desc *desc;
	struct irq_data *data;
	struct cpumask online_new;	/* cpu_online_mask - this_cpu */
	struct cpumask affinity_new;	/* affinity - this_cpu */

	cpumask_copy(&online_new, cpu_online_mask);
	cpu_clear(smp_processor_id(), online_new);

	this_count = 0;
	for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) {
		irq = __this_cpu_read(vector_irq[vector]);
		if (irq >= 0) {
			desc = irq_to_desc(irq);
			data = irq_desc_get_irq_data(desc);
			cpumask_copy(&affinity_new, data->affinity);
			cpu_clear(smp_processor_id(), affinity_new);
			/*
			 * Count active, non-per-cpu irqs whose remaining
			 * affinity is empty or is not covered by the
			 * remaining online cpus. An empty mask is a subset
			 * of every mask, so it is tested separately.
			 */
			if (irq_has_action(irq) && !irqd_is_per_cpu(data) &&
			    (cpumask_empty(&affinity_new) ||
			     !cpumask_subset(&affinity_new, &online_new)))
				this_count++;
		}
	}
	...

If I go back to the various examples, this appears to work. For example, your
previous case was that all CPUs are online, CPU 1 goes down, and we have an IRQ
with affinity for CPUs (1,2). We skip this IRQ, which is correct, because CPU 2
is still online and still in the IRQ's affinity mask.

And if we have another IRQ with affinity for only CPU 1, we will not skip this
IRQ, which is also correct, because no CPU is left in its affinity mask once
CPU 1 is gone.

I've tried other examples and they appear to work AFAICT.
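
For anyone who wants to try further examples by hand, here is a rough userspace sketch of the mask arithmetic only. It is not kernel code: mask_t, mask_without(), mask_subset() and irq_needs_new_vector() are hypothetical names that stand in for struct cpumask, cpu_clear() and cpumask_subset(). It assumes the intended rule is the one the two examples describe, namely that an IRQ is counted when its affinity, minus the CPU going down, is either empty or no longer inside the remaining online CPUs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t mask_t;

/* Remove one CPU (0..63) from a mask; models cpu_clear(). */
static mask_t mask_without(mask_t mask, unsigned int cpu)
{
	return mask & ~((mask_t)1 << cpu);
}

/* True if every CPU in a is also in b; models cpumask_subset(). */
static bool mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Assumed rule: the IRQ must be counted if, once dying_cpu is removed,
 * its affinity is empty or names a CPU that is not online. The empty
 * case is tested on its own because an empty mask is a subset of any mask.
 */
static bool irq_needs_new_vector(mask_t affinity, mask_t online,
				 unsigned int dying_cpu)
{
	mask_t affinity_new = mask_without(affinity, dying_cpu);
	mask_t online_new = mask_without(online, dying_cpu);

	return affinity_new == 0 || !mask_subset(affinity_new, online_new);
}
```

With CPUs 0-3 online (0xF) and CPU 1 going down, irq_needs_new_vector(0x6, 0xF, 1) models the affinity (1,2) case and returns false, while irq_needs_new_vector(0x2, 0xF, 1) models the affinity-only-CPU-1 case and returns true.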