Message-ID: <Z68E_ar8l7vNOxgh@gpd3>
Date: Fri, 14 Feb 2025 09:55:25 +0100
From: Andrea Righi <arighi@...dia.com>
To: Yury Norov <yury.norov@...il.com>
Cc: Tejun Heo <tj@...nel.org>, David Vernet <void@...ifault.com>,
	Changwoo Min <changwoo@...lia.com>, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Joel Fernandes <joel@...lfernandes.org>, Ian May <ianm@...dia.com>,
	bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/7] mm/numa: Introduce nearest_node_nodemask()

Hi Yury,

On Thu, Feb 13, 2025 at 12:12:46PM -0500, Yury Norov wrote:
...
> > > >  include/linux/numa.h |  7 +++++++
> > > >  mm/mempolicy.c       | 32 ++++++++++++++++++++++++++++++++
> > > >  2 files changed, 39 insertions(+)
> > > > 
> > > > diff --git a/include/linux/numa.h b/include/linux/numa.h
> > > > index 31d8bf8a951a7..e6baaf6051bcf 100644
> > > > --- a/include/linux/numa.h
> > > > +++ b/include/linux/numa.h
> > > > @@ -31,6 +31,8 @@ void __init alloc_offline_node_data(int nid);
> > > >  /* Generic implementation available */
> > > >  int numa_nearest_node(int node, unsigned int state);
> > > >  
> > > > +int nearest_node_nodemask(int node, nodemask_t *mask);
> > > > +
> > > 
> > > See how you use it. It looks a bit inconsistent with the other functions:
> > > 
> > >   #define for_each_node_numadist(node, unvisited)                                \
> > >          for (int start = (node),                                                \
> > >               node = nearest_node_nodemask((start), &(unvisited));               \
> > >               node < MAX_NUMNODES;                                               \
> > >               node_clear(node, (unvisited)),                                     \
> > >               node = nearest_node_nodemask((start), &(unvisited)))
> > >   
> > > 
> > > I would suggest to make it aligned with the rest of the API:
> > > 
> > >   #define node_clear(node, dst) __node_clear((node), &(dst))
> > >   static __always_inline void __node_clear(int node, volatile nodemask_t *dstp)
> > >   {
> > >           clear_bit(node, dstp->bits);
> > >   }
> > 
> > Sorry Yury, can you elaborate more on this? What do you mean by
> > inconsistent? Is it the volatile nodemask_t *?
> 
> What I mean is:
>   #define nearest_node_nodemask(start, srcp) \
>                 __nearest_node_nodemask((start), &(srcp))
>   int __nearest_node_nodemask(int node, nodemask_t *mask);

This all makes sense if nearest_node_nodemask() lives in
include/linux/nodemask.h and is treated as a nodemask API. However, I
thought we had agreed to place it in include/linux/numa.h, since it seems
more of a NUMA API, similar to numa_nearest_node(). Under that assumption I
was planning to follow the style of numa_nearest_node().

Or do you think it should go in linux/nodemask.h and follow the style of
the other nodemask APIs?
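
To make the two options concrete, here is a minimal userspace sketch of both calling styles. It is not the code from the patch: nodemask_t is reduced to a single word, MAX_NUMNODES is set to 4, and sketch_distance[] is a made-up table standing in for node_distance().

```c
#include <limits.h>

/* Hypothetical small value for this sketch; the kernel derives it from config. */
#define MAX_NUMNODES 4

/* Simplified stand-in for the kernel's nodemask_t: one bit per node. */
typedef struct {
	unsigned long bits;
} nodemask_t;

/* Made-up distance table standing in for node_distance(). */
static const int sketch_distance[MAX_NUMNODES][MAX_NUMNODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 20, 30 },
	{ 30, 20, 10, 20 },
	{ 40, 30, 20, 10 },
};

/*
 * numa.h style: the caller passes a pointer to the mask, as with the
 * prototype in the patch. @node must be a valid node ID.
 */
static int __nearest_node_nodemask(int node, nodemask_t *mask)
{
	int n, min_dist = INT_MAX, min_node = MAX_NUMNODES;

	for (n = 0; n < MAX_NUMNODES; n++) {
		/* Skip nodes that are not set in the mask. */
		if (!(mask->bits & (1UL << n)))
			continue;
		if (sketch_distance[node][n] < min_dist) {
			min_dist = sketch_distance[node][n];
			min_node = n;
		}
	}
	return min_node;
}

/* nodemask.h style: the caller names the mask and the macro takes its address. */
#define nearest_node_nodemask(start, src) \
	__nearest_node_nodemask((start), &(src))
```

With the wrapper, a caller writes nearest_node_nodemask(node, unvisited) in the same way as node_clear(node, unvisited). Without it, the caller writes &unvisited explicitly, as with the prototype currently in the patch.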

> 
> That way you'll be able to make the for-loop above look more
> uniform:
> 
>   #define for_each_node_numadist(node, unvisited)                   \
>          for (int __s = (node),                                     \
>               (node) = nearest_node_nodemask(__s, (unvisited));      \
>               (node) < MAX_NUMNODES;                                 \
>               node_clear((node), (unvisited)),                       \
>               (node) = nearest_node_nodemask(__s, (unvisited)))
> 
> > > >  #ifndef memory_add_physaddr_to_nid
> > > >  int memory_add_physaddr_to_nid(u64 start);
> > > >  #endif
> > > > @@ -47,6 +49,11 @@ static inline int numa_nearest_node(int node, unsigned int state)
> > > >  	return NUMA_NO_NODE;
> > > >  }
> > > >  
> > > > +static inline int nearest_node_nodemask(int node, nodemask_t *mask)
> > > > +{
> > > > +	return NUMA_NO_NODE;
> > > > +}
> > > > +
> > > >  static inline int memory_add_physaddr_to_nid(u64 start)
> > > >  {
> > > >  	return 0;
> > > > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > > > index 162407fbf2bc7..1e2acf187ea3a 100644
> > > > --- a/mm/mempolicy.c
> > > > +++ b/mm/mempolicy.c
> > > > @@ -196,6 +196,38 @@ int numa_nearest_node(int node, unsigned int state)
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(numa_nearest_node);
> > > >  
> > > > +/**
> > > > + * nearest_node_nodemask - Find the node in @mask at the nearest distance
> > > > + *			   from @node.
> > > > + *
> > > > + * @node: the node to start the search from.
> > > > + * @mask: a pointer to a nodemask representing the allowed nodes.
> > > > + *
> > > > + * This function iterates over all nodes in the given state and calculates
> > > > + * the distance to the starting node.
> > > > + *
> > > > + * Returns the node ID in @mask that is the closest in terms of distance
> > > > + * from @node, or MAX_NUMNODES if no node is found.
> > > > + */
> > > > +int nearest_node_nodemask(int node, nodemask_t *mask)
> > > > +{
> > > > +	int dist, n, min_dist = INT_MAX, min_node = MAX_NUMNODES;
> > > > +
> > > > +	if (node == NUMA_NO_NODE)
> > > > +		return MAX_NUMNODES;
> > > 
> > > This makes it unclear: you make it legal to pass NUMA_NO_NODE, but
> > > your function returns something useless. I don't think it would help
> > > users in any reasonable scenario.
> > > 
> > > So, if you don't want user to call this with node == NUMA_NO_NODE,
> > > just describe it in comment on top of the function. Otherwise, please
> > > do something useful like 
> > > 
> > > 	if (node == NUMA_NO_NODE)
> > > 		node = current_node;
> > > 
> > > I would go with option 1. Notice, node_distance() doesn't bother to
> > > check against NUMA_NO_NODE.
> > 
> > Hm... is it? Looking at __node_distance(), it doesn't seem really safe to
> > pass a negative value (maybe I'm missing something?).
> 
> It's not safe, but inside the kernel we don't check parameters. Out of
> your courtesy you may decide to put a comment, but strictly speaking you
> don't have to.
> 
> > Anyway, I'd also prefer to go with option 1 and not implicitly assuming
> > NUMA_NO_NODE == current node (it feels that it might hide nasty bugs).
> 
> Yeah, very true
> 
> > So, I can add a comment in the description to clarify that NUMA_NO_NODE is
> > forbidden, but what if someone passes it? Should we WARN_ON_ONCE() at
> > least?
> 
> He will brick his testing board, and learn to read comments the hard
> way.
> 
> Speaking more seriously, you will most likely be CCed as the author of
> that function, and you will be able to comment on that in review. Also,
> there's a great chance that it will be caught by KASAN or some other
> sanitizer tool even before someone sends a buggy patch.
> 
> This is a problem as old as the world and very well known, and everyone
> is aware of it.

Ok, makes sense, I'll just clarify this in the comment then.
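
As a rough illustration of what that would look like, here is a userspace sketch with the restriction stated in the comment and no runtime check. It is not the final patch: nodemask_t is a one-word stand-in, MAX_NUMNODES is set to 4, sketch_distance[] is a made-up replacement for node_distance(), and visit_by_distance() is a hypothetical helper that mimics the for_each_node_numadist() loop.

```c
#include <limits.h>

#define MAX_NUMNODES 4          /* hypothetical small value for this sketch */
#define NUMA_NO_NODE (-1)

/* Simplified stand-in for the kernel's nodemask_t: one bit per node. */
typedef struct {
	unsigned long bits;
} nodemask_t;

/* Made-up distance table standing in for node_distance(). */
static const int sketch_distance[MAX_NUMNODES][MAX_NUMNODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 20, 30 },
	{ 30, 20, 10, 20 },
	{ 40, 30, 20, 10 },
};

/*
 * nearest_node_nodemask - find the node in @mask nearest to @node.
 *
 * @node must be a valid node ID. Passing NUMA_NO_NODE is not allowed
 * and is not checked here; doing so indexes the table out of bounds.
 *
 * Returns the closest node in @mask, or MAX_NUMNODES if @mask is empty.
 */
static int nearest_node_nodemask(int node, nodemask_t *mask)
{
	int n, min_dist = INT_MAX, min_node = MAX_NUMNODES;

	for (n = 0; n < MAX_NUMNODES; n++) {
		if (!(mask->bits & (1UL << n)))
			continue;
		if (sketch_distance[node][n] < min_dist) {
			min_dist = sketch_distance[node][n];
			min_node = n;
		}
	}
	return min_node;
}

/*
 * Hypothetical helper: record the nodes of @unvisited in order of
 * increasing distance from @start, clearing each node once visited.
 * @unvisited is passed by value, so the caller's mask is untouched.
 */
static int visit_by_distance(int start, nodemask_t unvisited, int *order)
{
	int count = 0;

	for (int node = nearest_node_nodemask(start, &unvisited);
	     node < MAX_NUMNODES;
	     unvisited.bits &= ~(1UL << node),
	     node = nearest_node_nodemask(start, &unvisited))
		order[count++] = node;

	return count;
}
```

With this table, starting from node 1 with all four nodes set, the helper visits nodes 1, 0, 2, 3. Nodes 0 and 2 are equally distant from node 1, and the strict comparison picks the lower ID first.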

Thanks,
-Andrea
