Date: Sat, 13 Apr 2024 12:11:20 +0100
From: Marc Zyngier <maz@...nel.org>
To: Dawei Li <dawei.li@...ngroup.cn>
Cc: tglx@...utronix.de,
	yury.norov@...il.com,
	akpm@...ux-foundation.org,
	florian.fainelli@...adcom.com,
	chenhuacai@...nel.org,
	jiaxun.yang@...goat.com,
	anup@...infault.org,
	palmer@...belt.com,
	samuel.holland@...ive.com,
	linux@...musvillemoes.dk,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/6] irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack

On Sat, 13 Apr 2024 11:29:20 +0100,
Dawei Li <dawei.li@...ngroup.cn> wrote:
> 
> Hi Marc,
> 
> Thanks for the review.
> 
> On Fri, Apr 12, 2024 at 02:53:32PM +0100, Marc Zyngier wrote:
> > On Fri, 12 Apr 2024 11:58:36 +0100,
> > Dawei Li <dawei.li@...ngroup.cn> wrote:
> > > 
> > > In general it's preferable to avoid placing cpumasks on the stack, as
> > > for large values of NR_CPUS these can consume significant amounts of
> > > stack space and make stack overflows more likely.
> > >
> > > Remove cpumask var on stack and use proper cpumask API to address it.
> > 
> > Define proper. Or better, define what is "improper" about the current
> > usage.
> 
> Sorry for the confusion.
> 
> I didn't mean that the current implementation is 'improper'; both
> implementations use equivalent APIs. I will remove this misleading
> wording from the commit message.
> 
> > 
> > >
> > > Signed-off-by: Dawei Li <dawei.li@...ngroup.cn>
> > > ---
> > >  drivers/irqchip/irq-gic-v3-its.c | 9 ++++++---
> > >  1 file changed, 6 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > > index fca888b36680..a821396c4261 100644
> > > --- a/drivers/irqchip/irq-gic-v3-its.c
> > > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > > @@ -3826,7 +3826,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
> > >  				bool force)
> > >  {
> > >  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> > > -	struct cpumask common, *table_mask;
> > > +	struct cpumask *table_mask;
> > >  	unsigned long flags;
> > >  	int from, cpu;
> > >  
> > > @@ -3850,8 +3850,11 @@ static int its_vpe_set_affinity(struct irq_data *d,
> > >  	 * If we are offered another CPU in the same GICv4.1 ITS
> > >  	 * affinity, pick this one. Otherwise, any CPU will do.
> > >  	 */
> > > -	if (table_mask && cpumask_and(&common, mask_val, table_mask))
> > > -		cpu = cpumask_test_cpu(from, &common) ? from : cpumask_first(&common);
> > > +	if (table_mask && cpumask_intersects(mask_val, table_mask)) {
> > > +		cpu = cpumask_test_cpu(from, mask_val) &&
> > > +		      cpumask_test_cpu(from, table_mask) ?
> > > +		      from : cpumask_first_and(mask_val, table_mask);
> > 
> > So we may end up computing the AND of the two bitmaps twice (once for
> > cpumask_intersects(), once for cpumask_first_and()), instead of only
> > doing it once.
> 
> Actually maybe it's possible to merge these 2 bitmap ops into one:
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index fca888b36680..7a267777bd0b 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -3826,7 +3826,8 @@ static int its_vpe_set_affinity(struct irq_data *d,
>                                 bool force)
>  {
>         struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> -       struct cpumask common, *table_mask;
> +       struct cpumask *table_mask;
> +       unsigned int common;
>         unsigned long flags;
>         int from, cpu;
> 
> @@ -3850,10 +3851,13 @@ static int its_vpe_set_affinity(struct irq_data *d,
>          * If we are offered another CPU in the same GICv4.1 ITS
>          * affinity, pick this one. Otherwise, any CPU will do.
>          */
> -       if (table_mask && cpumask_and(&common, mask_val, table_mask))
> -               cpu = cpumask_test_cpu(from, &common) ? from : cpumask_first(&common);
> -       else
> +       if (table_mask && (common = cpumask_first_and(mask_val, table_mask)) < nr_cpu_ids) {
> +               cpu = cpumask_test_cpu(from, mask_val) &&
> +                     cpumask_test_cpu(from, table_mask) ?
> +                     from : common;
> +       } else {
>                 cpu = cpumask_first(mask_val);
> +       }
> 
> > 
> > I don't expect that to be horrible, but I also note that you don't
> > even talk about the trade-offs you are choosing to make.
> 
> With change above, I assume that the tradeoff is minor and can be ignored?

Yup, this works. My preference would be something which I find
slightly more readable though (avoiding assignment in the
conditional):

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index fca888b36680..299dafc7c0ea 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3826,9 +3826,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
 				bool force)
 {
 	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
-	struct cpumask common, *table_mask;
+	struct cpumask *table_mask;
 	unsigned long flags;
-	int from, cpu;
+	int from, cpu = nr_cpu_ids;
 
 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as
@@ -3850,10 +3850,15 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	 * If we are offered another CPU in the same GICv4.1 ITS
 	 * affinity, pick this one. Otherwise, any CPU will do.
 	 */
-	if (table_mask && cpumask_and(&common, mask_val, table_mask))
-		cpu = cpumask_test_cpu(from, &common) ? from : cpumask_first(&common);
-	else
+	if (table_mask)
+		cpu = cpumask_any_and(mask_val, table_mask);
+	if (cpu < nr_cpu_ids) {
+		if (cpumask_test_cpu(from, mask_val) &&
+		    cpumask_test_cpu(from, table_mask))
+			cpu = from;
+	} else {
 		cpu = cpumask_first(mask_val);
+	}
 
 	if (from == cpu)
 		goto out;
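
For anyone who wants to poke at the selection logic outside the kernel,
here is a minimal userspace sketch of the flow in the diff above. It is
not kernel code: a single 64-bit word stands in for struct cpumask,
NR_CPU_IDS stands in for nr_cpu_ids, and mask_test_cpu(),
mask_first_and() and pick_cpu() are hypothetical helpers invented for
this illustration. It also assumes that "any" CPU in the intersection
can be modelled as the first set bit.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumption: stand-in for nr_cpu_ids, fixed at one 64-bit word. */
#define NR_CPU_IDS 64u

/* Hypothetical model of cpumask_test_cpu() on a one-word mask. */
static int mask_test_cpu(unsigned int cpu, uint64_t mask)
{
	return cpu < NR_CPU_IDS && ((mask >> cpu) & 1);
}

/*
 * Hypothetical model of cpumask_first_and(): index of the first bit set
 * in both masks, or NR_CPU_IDS if the masks do not intersect.
 */
static unsigned int mask_first_and(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if ((both >> cpu) & 1)
			return cpu;
	return NR_CPU_IDS;
}

/*
 * Mirror of the proposed flow: a single AND walk, no temporary mask.
 * table_mask may be NULL, as in the driver.
 */
static unsigned int pick_cpu(unsigned int from, uint64_t mask_val,
			     const uint64_t *table_mask)
{
	unsigned int cpu = NR_CPU_IDS;

	if (table_mask)
		cpu = mask_first_and(mask_val, *table_mask);
	if (cpu < NR_CPU_IDS) {
		/* Stay on the current CPU if it is in both masks. */
		if (mask_test_cpu(from, mask_val) &&
		    mask_test_cpu(from, *table_mask))
			cpu = from;
	} else {
		/* No table mask, or no overlap: any offered CPU will do. */
		cpu = mask_first_and(mask_val, mask_val);
	}
	return cpu;
}
```

The point of the shape is that the intersection is walked once, and the
"is 'from' in the intersection" check becomes two single-bit tests
rather than a lookup in a temporary mask on the stack.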

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
