linux-kernel - Re: [PATCH] cpumask: Ensure the visibility of set_nr_cpu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <Zrj2v4IFbS/1yPnz@vaxr-BM6660-BM6360>
Date: Mon, 12 Aug 2024 01:37:03 +0800
From: I Hsin Cheng <richard120310@...il.com>
To: Yury Norov <yury.norov@...il.com>
Cc: andriy.shevchenko@...ux.intel.com, linux@...musvillemoes.dk,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cpumask: Ensure the visibility of set_nr_cpu_ids

On Sun, Aug 11, 2024 at 08:18:01AM -0700, Yury Norov wrote:
> On Sun, Aug 11, 2024 at 05:25:01PM +0800, I Hsin Cheng wrote:
> > The variable "nr_cpu_ids" is a system-wide variable which should be seen
> > as consistent all the time. For example it's set in one of the kernel
> > setup procedure "prefill_possible_map", the operations here should
> > happens before all the code after setup, which means the operations here
> > should be visible to all the code after setup.
> > 
> > set_cpu_possible() ensure it's visibility because it eventually falls
> > into an atomic instruction, however the function "set_nr_cpu_ids()"
> > fails to make the guarantee since it only performs a normal write
> > operations.
> 
> Set_cpu_possible() is a completely different thing.
>  
> > Adding the macro "WRITE_ONCE()" will prevent the compiler from re-order
> > the instruction of the write operation for "nr_cpu_ids", so we can
> > guarantee the operation is visible to all the codes coming after it.
> > 
> > Signed-off-by: I Hsin Cheng <richard120310@...il.com>
> 
> I don't understand this. nr_cpu_ids is initialized at compile time
> to NR_CPUS, to represent maximum number of bits in cpumasks.
> 
> Later on runtime we update nr_cpu_ids with an actual number of possible
> CPUs in the system. The type of the variable is unsigned int, and it
> means that threads accessing it will either fetch NR_CPUS, or new value
> coherently.
> 
> Having nr_cpu_ids == NR_CPUS is not an error, it's just a non-optimal
> value. The only effect of it is that kernel algorithms traverse unused
> part of cpumasks for the first few microseconds after boot.
> 
> Can you explain in details what type of race you're trying to fix?
> Which architecture? What is the race scenario?
> 
> > ---
> >  include/linux/cpumask.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index f10fb87d4..3731f5e43 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -46,7 +46,7 @@ static inline void set_nr_cpu_ids(unsigned int nr)
> >  #if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
> >  	WARN_ON(nr != nr_cpu_ids);
> >  #else
> > -	nr_cpu_ids = nr;
> > +	WRITE_ONCE(nr_cpu_ids, nr);
> 
> 
> WRITE_ONCE()? How is that supposed to work? The only possible effect
> would be reordering of a couple of instructions. How would that help
> threads running on other CPUs synchronize any better?
> 
> Regardless, WRITE_ONCE() should always be paired with READ_ONCE() to
> make it working. So, if we take this, we should also make every read of
> nr_cpu_ids by using READ_ONCE(). nr_cpu_ids is used in fast paths in
> many places, particularly as loop termination condition. Things like
> this:
> 
>         while (cpu < READ_ONCE(nr_cpu_ids))
>                 do_something_very_quick();
> 
> would definitely hit performance.
> 
> Thanks,
> Yury
> 
> >  #endif
> >  }
> >  
> > -- 
> > 2.34.1

Thanks for the reply and your detailed explanation, I wasn't trying to
fix any race conditions, I was thinking the same as you mentioned that
some of the cpumask's operation would traverse more times than it
actually needs to. I took set_cpu_possible() as an example trying to
express I think it would be nice for nr_cpu_ids to make synchronized
guarantees across CPUs, as WRITE_ONCE() would also provide write memory
barrier.

> Regardless, WRITE_ONCE() should always be paired with READ_ONCE() to
> make it working. So, if we take this, we should also make every read of
> nr_cpu_ids by using READ_ONCE(). nr_cpu_ids is used in fast paths in
> many places, particularly as loop termination condition. Things like
> this:
>
>         while (cpu < READ_ONCE(nr_cpu_ids))
>                 do_something_very_quick();
>
> would definitely hit performance.

Indeed if WRITCE_ONCE() is going to be applied here, every read of
nr_cpu_ids will have to use READ_ONCE() and it'll harm the fast path's
performance. I didn't considered this part, thanks for pointing it out.
Comparing to the number of redundant traversal, I think the performance
hit introduced by READ_ONCE() in fast path will be larger, so we should
stick to the same way. I learned alot and should reconsidered many parts
of my former knowledge about cpumask from your reply.
Thanks again for your detail explanation and kindly reply !

Best regards,
I-Hsin Cheng