[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6a004f9d-bcfd-378c-8b62-051e1f4e09b4@huawei.com>
Date: Fri, 6 Dec 2019 16:16:36 +0000
From: John Garry <john.garry@...wei.com>
To: Marc Zyngier <maz@...nel.org>
CC: <tglx@...utronix.de>, <chenxiang66@...ilicon.com>,
<bigeasy@...utronix.de>, <linux-kernel@...r.kernel.org>,
<hare@...e.com>, <ming.lei@...hat.com>, <hch@....de>,
<axboe@...nel.dk>, <bvanassche@....org>, <peterz@...radead.org>,
<mingo@...hat.com>
Subject: Re: [PATCH RFC 1/1] genirq: Make threaded handler use irq affinity
for managed interrupt
On 06/12/2019 15:22, Marc Zyngier wrote:
Hi Marc,
>
> On 2019-12-06 14:35, John Garry wrote:
>> Currently the cpu allowed mask for the threaded part of a threaded irq
>> handler will be set to the effective affinity of the hard irq.
>>
>> Typically the effective affinity of the hard irq will be for a single
>> cpu. As such,
>> the threaded handler would always run on the same cpu as the hard irq.
>>
>> We have seen scenarios in high data-rate throughput testing that the cpu
>> handling the interrupt can be totally saturated handling both the hard
>> interrupt and threaded handler parts, limiting throughput.
>>
>> For when the interrupt is managed, allow the threaded part to run on all
>> cpus in the irq affinity mask.
>>
>> Signed-off-by: John Garry <john.garry@...wei.com>
>> ---
>> kernel/irq/manage.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
>> index 1753486b440c..8e7f8e758a88 100644
>> --- a/kernel/irq/manage.c
>> +++ b/kernel/irq/manage.c
>> @@ -968,7 +968,11 @@ irq_thread_check_affinity(struct irq_desc *desc,
>> struct irqaction *action)
>> if (cpumask_available(desc->irq_common_data.affinity)) {
>> const struct cpumask *m;
>>
>> - m = irq_data_get_effective_affinity_mask(&desc->irq_data);
>> + if (irqd_affinity_is_managed(&desc->irq_data))
>> + m = desc->irq_common_data.affinity;
>> + else
>> + m = irq_data_get_effective_affinity_mask(
>> + &desc->irq_data);
>> cpumask_copy(mask, m);
>> } else {
>> valid = false;
>
> Although I completely understand that there are cases where you
> really want to let your thread roam all CPUs, I feel like changing
> this based on a seemingly unrelated property is likely to trigger
> yet another whack-a-mole episode. I'd feel much more comfortable
> if there was a way to let the IRQ subsystem know about what is best.
>
> Shouldn't the endpoint driver know better about it?
I did propose that same idea here:
https://lore.kernel.org/lkml/fd7d6101-37f4-2d34-f2f7-cfeade610278@huawei.com/
And that fits my agenda to get best throughput figures, while not
possibly affecting others.
But it seems that we could do better to make this a common policy: allow
the threaded part to roam when that CPU is overloaded, but how...?
Note that
> I have no data supporting an approach or the other, hence playing
> the role of the village idiot here.
>
Understood. My data is that we get an ~11% throughput boost for our
storage test with this change.
> Thanks,
>
> M.
Thanks,
John
Powered by blists - more mailing lists