lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZAEPWQrdZd1N1rkn@tpad>
Date:   Thu, 2 Mar 2023 18:04:25 -0300
From:   Marcelo Tosatti <mtosatti@...hat.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     Christoph Lameter <cl@...ux.com>,
        Aaron Tomlin <atomlin@...mlin.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v2 02/11] this_cpu_cmpxchg: ARM64: switch
 this_cpu_cmpxchg to locked, add _local function

On Thu, Mar 02, 2023 at 03:53:12PM -0500, Peter Xu wrote:
> On Thu, Feb 09, 2023 at 12:01:52PM -0300, Marcelo Tosatti wrote:
> > Goal is to have vmstat_shepherd to transfer from
> > per-CPU counters to global counters remotely. For this, 
> > an atomic this_cpu_cmpxchg is necessary.
> > 
> > Following the kernel convention for cmpxchg/cmpxchg_local,
> > change ARM's this_cpu_cmpxchg_ helpers to be atomic,
> > and add this_cpu_cmpxchg_local_ helpers which are not atomic.
> 
> I can follow on the necessity of having the _local version, however two
> questions below.
> 
> > 
> > Signed-off-by: Marcelo Tosatti <mtosatti@...hat.com>
> > 
> > Index: linux-vmstat-remote/arch/arm64/include/asm/percpu.h
> > ===================================================================
> > --- linux-vmstat-remote.orig/arch/arm64/include/asm/percpu.h
> > +++ linux-vmstat-remote/arch/arm64/include/asm/percpu.h
> > @@ -232,13 +232,23 @@ PERCPU_RET_OP(add, add, ldadd)
> >  	_pcp_protect_return(xchg_relaxed, pcp, val)
> >  
> >  #define this_cpu_cmpxchg_1(pcp, o, n)	\
> > -	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +	_pcp_protect_return(cmpxchg, pcp, o, n)
> >  #define this_cpu_cmpxchg_2(pcp, o, n)	\
> > -	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +	_pcp_protect_return(cmpxchg, pcp, o, n)
> >  #define this_cpu_cmpxchg_4(pcp, o, n)	\
> > -	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +	_pcp_protect_return(cmpxchg, pcp, o, n)
> >  #define this_cpu_cmpxchg_8(pcp, o, n)	\
> > +	_pcp_protect_return(cmpxchg, pcp, o, n)
> 
> This makes this_cpu_cmpxchg_*() not only non-local, but also (especially
> for arm64) memory barrier implications since cmpxchg() has a strong memory
> barrier, while the old this_cpu_cmpxchg*() doesn't have, afaiu.
> 
> Maybe it's not a big deal if the audience of this helper is still limited
> (e.g. we can add memory barriers if we don't want strict ordering
> implication), but just to check with you on whether it's intended, and if
> so whether it may worth some comments.

It happens that on ARM-64 cmpxchg_local == cmpxchg_relaxed.

See cf10b79a7d88edc689479af989b3a88e9adf07ff.

This patchset maintains the current behaviour
of this_cpu_cmpxch (for this_cpu_cmpxch_local), which was:

 #define this_cpu_cmpxchg_1(pcp, o, n)  \                                                                           
-       _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
+       _pcp_protect_return(cmpxchg, pcp, o, n)
 #define this_cpu_cmpxchg_2(pcp, o, n)  \                                                                           
-       _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
+       _pcp_protect_return(cmpxchg, pcp, o, n)
 #define this_cpu_cmpxchg_4(pcp, o, n)  \                                                                           
-       _pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
+       _pcp_protect_return(cmpxchg, pcp, o, n)
 #define this_cpu_cmpxchg_8(pcp, o, n)  \                                                                           
+       _pcp_protect_return(cmpxchg, pcp, o, n)

> > +
> > +#define this_cpu_cmpxchg_local_1(pcp, o, n)	\
> >  	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +#define this_cpu_cmpxchg_local_2(pcp, o, n)	\
> > +	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +#define this_cpu_cmpxchg_local_4(pcp, o, n)	\
> > +	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> > +#define this_cpu_cmpxchg_local_8(pcp, o, n)	\
> > +	_pcp_protect_return(cmpxchg_relaxed, pcp, o, n)
> 
> I think cmpxchg_relaxed()==cmpxchg_local() here for aarch64, however should
> we still use cmpxchg_local() to pair with this_cpu_cmpxchg_local_*()?

Since cmpxchg_local = cmpxchg_relaxed, seems like this is not necessary.

> Nothing about your patch along since it was the same before, but I'm
> wondering whether this is a good time to switchover.

I would say that another patch is more appropriate to change this, 
if desired.

> The other thing is would it be good to copy arch-list for each arch patch?
> Maybe it'll help to extend the audience too.

Yes, should have done that (or CC each individual maintainer). Will do
on next version.

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ