[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <864j7cybay.wl-maz@kernel.org>
Date: Thu, 22 Aug 2024 13:47:01 +0100
From: Marc Zyngier <maz@...nel.org>
To: Kunkun Jiang <jiangkunkun@...wei.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Oliver Upton
<oliver.upton@...ux.dev>,
James Morse <james.morse@....com>,
Suzuki K Poulose
<suzuki.poulose@....com>,
Zenghui Yu <yuzenghui@...wei.com>,
"open list:IRQ\
SUBSYSTEM" <linux-kernel@...r.kernel.org>,
"moderated list:ARM SMMU DRIVERS"
<linux-arm-kernel@...ts.infradead.org>,
<kvmarm@...ts.linux.dev>,
"wanghaibin.wang@...wei.com" <wanghaibin.wang@...wei.com>,
<nizhiqiang1@...wei.com>,
"tangnianyao@...wei.com" <tangnianyao@...wei.com>,
<wangzhou1@...ilicon.com>
Subject: Re: [bug report] GICv4.1: multiple vpus execute vgic_v4_load at the same time will greatly increase the time consumption
On Thu, 22 Aug 2024 11:59:50 +0100,
Kunkun Jiang <jiangkunkun@...wei.com> wrote:
>
> Hi Marc,
>
> On 2024/8/22 16:26, Marc Zyngier wrote:
> >>>> According to analysis, this problem is due to the execution of vgic_v4_load.
> >>>> vcpu_load or kvm_sched_in
> >>>> kvm_arch_vcpu_load
> >>>> ...
> >>>> vgic_v4_load
> >>>> irq_set_affinity
> >>>> ...
> >>>> irq_do_set_affinity
> >>>> raw_spin_lock(&tmp_mask_lock)
> >>>> chip->irq_set_affinity
> >>>> ...
> >>>> its_vpe_set_affinity
> >>>>
> >>>> The tmp_mask_lock is the key. This is a global lock. I don't quite
> >>>> understand
> >>>> why tmp_mask_lock is needed here. I think there are two possible
> >>>> solutions here:
> >>>> 1. Remove this tmp_mask_lock
> >>>
> >>> Maybe you could have a look at 33de0aa4bae98 (and 11ea68f553e24)? It
> >>> would allow you to understand the nature of the problem.
> >>>
> >>> This can probably be replaced with a per-CPU cpumask, which would
> >>> avoid the locking, but potentially result in a larger memory usage.
> >>
> >> Thanks, I will try it.
> >
> > A simple alternative would be this:
> >
> > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > index dd53298ef1a5..0d11b74af38c 100644
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -224,15 +224,12 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> > struct irq_desc *desc = irq_data_to_desc(data);
> > struct irq_chip *chip = irq_data_get_irq_chip(data);
> > const struct cpumask *prog_mask;
> > + struct cpumask tmp_mask = {};
> > int ret;
> > - static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> > - static struct cpumask tmp_mask;
> > -
> > if (!chip || !chip->irq_set_affinity)
> > return -EINVAL;
> > - raw_spin_lock(&tmp_mask_lock);
> > /*
> > * If this is a managed interrupt and housekeeping is enabled on
> > * it check whether the requested affinity mask intersects with
> > @@ -280,8 +277,6 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> > else
> > ret = -EINVAL;
> > - raw_spin_unlock(&tmp_mask_lock);
> > -
> > switch (ret) {
> > case IRQ_SET_MASK_OK:
> > case IRQ_SET_MASK_OK_DONE:
> >
> > but that will eat a significant portion of your stack if your kernel is
> > configured for a large number of CPUs.
> >
>
> Currently CONFIG_NR_CPUS=4096,each `struct cpumask` occupies 512 bytes.
This seems crazy. Why would you build a kernel with something *that*
big, specially considering that you have a lot less than 1k CPUs?
[...]
> > The removal of this global lock is the only option in my opinion.
> > Either the cpumask becomes a stack variable, or it becomes a static
> > per-CPU variable. Both have drawbacks, but they are not a bottleneck
> > anymore.
>
> I also prefer to remove the global lock. Which variable do you think is
> better?
Given the number of CPUs your system is configured for, there is no
good answer. An on-stack variable is dangerously large, and a per-CPU
cpumask results in 2MB being allocated, which I find insane.
You'll have to pick your own poison and convince Thomas of the
validity of your approach.
M.
--
Without deviation from the norm, progress is not possible.
Powered by blists - more mailing lists