Message-ID: <e65af3fd-e7c8-bd9b-75ff-f2d990338221@huawei.com>
Date: Tue, 10 Mar 2020 11:33:17 +0000
From: John Garry <john.garry@...wei.com>
To: Thomas Gleixner <tglx@...utronix.de>, Marc Zyngier <maz@...nel.org>
CC: <linux-kernel@...r.kernel.org>,
Jason Cooper <jason@...edaemon.net>,
"Ming Lei" <ming.lei@...hat.com>,
"chenxiang (M)" <chenxiang66@...ilicon.com>
Subject: Re: [PATCH] irqchip/gic-v3-its: Balance initial LPI affinity across CPUs
On 20/01/2020 19:24, Thomas Gleixner wrote:
> Marc,
>
> Marc Zyngier <maz@...nel.org> writes:
>> We're stuck between a rock and a hard place here:
>>
>> (1) We place all interrupts on the least loaded CPU that matches
>>     the affinity -> results in performance issues on some funky
>>     HW (like D05's SAS controller).
>>
>> (2) We place managed interrupts on the least loaded CPU that matches
>>     the affinity -> we have artificial load on NUMA boundaries, and
>>     reduced spread of overlapping managed interrupts.
>>
>> (3) We don't account for non-managed LPIs, and we run the risk of
>>     unpredictable performance because we don't really know where
>>     the *other* interrupts are.
>>
>> My personal preference would be to go for (1), as in my original post.
>> I find (3) the least appealing, because we don't track things anymore.
>> (2) feels like "the least of all evils", as it is a decent performance
>> gain, seems to give predictable performance, and doesn't regress lesser
>> systems...
>>
>> I'm definitely open to suggestions here.
>
> The way x86 does it and that's mostly ok except for some really broken
> setups is:
>
> 1) Non-managed interrupts:
>
>    If the interrupt is bound to a node, then we try to find a target
>
>      I) in the intersection of affinity mask and node mask.
>
>      II) in the nodemask itself
>
>          Yes we ignore affinity mask there because that's pretty much
>          the same as if the given affinity does not contain an online
>          CPU.
>
>    If all of that fails then we try the nodeless mode
>
>    If the interrupt is not bound to a node, then we try to find a target
>
>      I) in the intersection of affinity mask and online mask.
>
>      II) in the onlinemask itself
>
>    Each step searches for the CPU in the searched mask which has the
>    least number of total interrupts assigned.
>
> 2) Managed interrupts
>
>    For managed interrupts we just search in the intersection of assigned
>    mask and online CPUs for the CPU with the least number of managed
>    interrupts.
>
>    If no CPU is online then the interrupt is shut down anyway, so no
>    fallback required.
>
> Don't know whether that's something you can map to ARM64, but I assume
> the principle of trying to enforce NUMA locality plus balancing the
> number of interrupts makes sense in general.
>
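For reference, the x86 policy quoted above could be sketched in plain C
roughly as follows. This is only an illustrative sketch, not the kernel's
implementation: the names (NR_CPUS, total_irqs, managed_irqs, least_loaded,
pick_nonmanaged, pick_managed) are hypothetical, CPU masks are modelled as
plain 32-bit bitmaps, a node mask of 0 is assumed to mean "not bound to a
node", and every search is assumed to be restricted to online CPUs.

```c
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical small system, one mask bit per CPU */

/* Hypothetical per-CPU counters, not the kernel's data structures. */
static unsigned int total_irqs[NR_CPUS];
static unsigned int managed_irqs[NR_CPUS];

/*
 * Return the CPU in @mask with the lowest @count, or -1 if @mask is empty.
 * Ties go to the lowest-numbered CPU.
 */
static int least_loaded(uint32_t mask, const unsigned int *count)
{
	int best = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mask & (UINT32_C(1) << cpu)))
			continue;
		if (best < 0 || count[cpu] < count[best])
			best = cpu;
	}
	return best;
}

/* Non-managed interrupts: balance on the total interrupt count. */
static int pick_nonmanaged(uint32_t affinity, uint32_t node, uint32_t online)
{
	int cpu;

	if (node) {	/* assumption: 0 means "no node binding" */
		/* I) intersection of affinity mask and node mask */
		cpu = least_loaded(affinity & node & online, total_irqs);
		if (cpu >= 0)
			return cpu;
		/* II) the node mask itself, ignoring the affinity mask */
		cpu = least_loaded(node & online, total_irqs);
		if (cpu >= 0)
			return cpu;
		/* otherwise fall through to the nodeless mode */
	}

	/* I) intersection of affinity mask and online mask */
	cpu = least_loaded(affinity & online, total_irqs);
	if (cpu >= 0)
		return cpu;
	/* II) the online mask itself */
	return least_loaded(online, total_irqs);
}

/*
 * Managed interrupts: balance on the managed interrupt count only.
 * A return value of -1 means no assigned CPU is online, in which case
 * the interrupt would be shut down and no fallback is attempted.
 */
static int pick_managed(uint32_t assigned, uint32_t online)
{
	return least_loaded(assigned & online, managed_irqs);
}
```

A caller would be expected to bump total_irqs[] (and managed_irqs[] for
managed interrupts) for the returned CPU once the interrupt is actually
placed there; that bookkeeping is left out of the sketch.
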
Hi Marc,

I was wondering if there is anything we can do to progress this patch?

Apart from being a good change in itself, I need to do some SMMU testing
for next-gen product development and I would like to include this patch,
preferably from mainline.

Cheers,
John