Message-ID: <83eb55b0-2f2d-3335-85cf-6d7ed379b3c7@huawei.com>
Date: Mon, 20 Jan 2020 18:21:02 +0000
From: John Garry <john.garry@...wei.com>
To: Marc Zyngier <maz@...nel.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Thomas Gleixner" <tglx@...utronix.de>,
Jason Cooper <jason@...edaemon.net>,
Ming Lei <ming.lei@...hat.com>,
"chenxiang (M)" <chenxiang66@...ilicon.com>
Subject: Re: [PATCH] irqchip/gic-v3-its: Balance initial LPI affinity across
CPUs
On 20/01/2020 17:42, Marc Zyngier wrote:
Hi Marc,
>>>  static u64 its_irq_get_msi_base(struct its_device *its_dev)
>>> @@ -2773,28 +2829,34 @@ static int its_irq_domain_activate(struct irq_domain *domain,
>>>  {
>>>  	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
>>>  	u32 event = its_get_event_id(d);
>>> -	const struct cpumask *cpu_mask = cpu_online_mask;
>>> -	int cpu;
>>> +	int ret = 0, cpu = nr_cpu_ids;
>>> +	const struct cpumask *reqmask;
>>> +	cpumask_var_t mask;
>>>  
>>> -	/* get the cpu_mask of local node */
>>> -	if (its_dev->its->numa_node >= 0)
>>> -		cpu_mask = cpumask_of_node(its_dev->its->numa_node);
>>> +	if (irqd_affinity_is_managed(d))
>>> +		reqmask = irq_data_get_affinity_mask(d);
>>> +	else
>>> +		reqmask = cpu_online_mask;
>>>  
>>> -	/* Bind the LPI to the first possible CPU */
>>> -	cpu = cpumask_first_and(cpu_mask, cpu_online_mask);
>>> -	if (cpu >= nr_cpu_ids) {
>>> -		if (its_dev->its->flags & ITS_FLAGS_WORKAROUND_CAVIUM_23144)
>>> -			return -EINVAL;
>>> +	if (!alloc_cpumask_var(&mask, GFP_KERNEL))
>>> +		return -ENOMEM;
>>>  
>>> -		cpu = cpumask_first(cpu_online_mask);
>>> +	its_compute_affinity(d, reqmask, mask);
>>> +	cpu = its_pick_target_cpu(mask);
>>> +	if (cpu >= nr_cpu_ids) {
>>> +		ret = -EINVAL;
>>> +		goto out;
>>>  	}
>>> +	atomic_inc(per_cpu_ptr(&cpu_lpi_count, cpu));
>>
>> I wonder if we should only consider managed interrupts in this
>> accounting?
>>
>> So cpu0 is effectively going to be excluded from the balancing, as it
>> will have so many LPIs targeted at it.
>
> Maybe, but only if the provided managed affinity gives you the
> opportunity of placing the LPI somewhere else. If the managed
> affinity says CPU0 only, then that's where you end up.

Of course, if there's no other cpu in the mask then so be it.
>
If my debug code is correct (with the above fix), cpu0 initially had 763
interrupts targeted at it on my D06 :)

But it's not just cpu0. I find that initial non-managed interrupt
affinity masks are generally set to CPU cluster/NUMA node masks, so the
first CPUs in those masks are a bit over-subscribed, and we may then be
spreading the managed interrupts over fewer CPUs in the mask.
This is a taste of the LPI distribution on my 96-core system:
cpu0 763
cpu1 2
cpu3 1
cpu4 2
cpu5 2
cpu6 0
cpu7 0
cpu8 2
cpu9 1
cpu10 0
...
cpu16 2
...
cpu24 8
...
cpu48 10 (numa node boundary)
...
>> And, for the others, even if we balance all the LPIs, won't irqbalance
>> (if running, obviously) come along and fiddle with these non-managed
>> interrupt affinities anyway?
>
> Of course, irqbalance will move things around. But that should be to
> CPUs that do not have too many screaming interrupts.
>
> Thanks,
>
> M.
Cheers,
John
>