Message-ID: <Z2Rlc8eljxSF0I0Z@yury-ThinkPad>
Date: Thu, 19 Dec 2024 10:26:59 -0800
From: Yury Norov <yury.norov@...il.com>
To: Tejun Heo <tj@...nel.org>
Cc: Andrea Righi <arighi@...dia.com>, David Vernet <void@...ifault.com>,
	Changwoo Min <changwoo@...lia.com>, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/6] sched/topology: introduce for_each_numa_hop_node() /
 sched_numa_hop_node()

On Wed, Dec 18, 2024 at 06:04:53AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Wed, Dec 18, 2024 at 11:23:40AM +0100, Andrea Righi wrote:
> ...
> > > So, this would work but given that there is nothing dynamic about this
> > > ordering, would it make more sense to build the ordering and store it
> > > per-node? Then, the iteration just becomes walking that array.
> > 
> > I've also considered doing that. I don't know if it'd work with offline
> > nodes, but maybe we can just check node_online(node) at each iteration and
> > skip those that are not online.

for_each_numa_hop_mask() only traverses N_CPU nodes, and N_CPU nodes have
proper distances.

I think that for_each_numa_hop_node() should match for_each_numa_hop_mask().
It would be good to cross-test them to ensure that they generate the same
order at least for N_CPU nodes.

If you think that for_each_numa_hop_node() should traverse non-N_CPU nodes,
you need a 'node_state' parameter. That would let us make sure that at
least the N_CPU portion works correctly.

> Yeah, there can be e.g. for_each_possible_node_by_dist() where nodes with
> unknown distances (offline ones?) are put at the end and then there's also
> for_each_online_node_by_dist() which filters out offline ones, and the
> ordering can be updated from a CPU hotplug callback.

We can assign UINT_MAX for those nodes I guess?
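
A minimal userspace sketch of that idea, not kernel code: if unknown
distances are stored as UINT_MAX, a plain sort by distance pushes those
nodes to the end of the per-node ordering without any special-casing.
MAX_NODES, build_node_order() and the dist[][] table are hypothetical
names used only for illustration; in the kernel the distances would come
from node_distance().

```c
#include <assert.h>
#include <limits.h>

#define MAX_NODES 8	/* hypothetical upper bound for this sketch */

/*
 * Fill order[] with node ids 0..nr_nodes-1 sorted by increasing distance
 * from 'start'. A distance of UINT_MAX marks a node whose distance is
 * unknown (assumption: e.g. an offline node), so it sorts last.
 */
static void build_node_order(unsigned int dist[MAX_NODES][MAX_NODES],
			     int nr_nodes, int start, int order[MAX_NODES])
{
	int i, j;

	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);
	assert(start >= 0 && start < nr_nodes);

	for (i = 0; i < nr_nodes; i++)
		order[i] = i;

	/*
	 * Plain O(N^2) bubble sort. The strict '>' comparison never swaps
	 * equal distances, so ties stay in node id order.
	 */
	for (i = 0; i < nr_nodes - 1; i++) {
		for (j = 0; j < nr_nodes - 1 - i; j++) {
			if (dist[start][order[j]] > dist[start][order[j + 1]]) {
				int tmp = order[j];

				order[j] = order[j + 1];
				order[j + 1] = tmp;
			}
		}
	}
}
```

With the ordering built once per node, the iterator only has to walk
order[] and can stop at the first UINT_MAX entry if it wants to skip
nodes with unknown distances.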

> The ordering can be
> probably put in an rcu protected array? I'm not sure what's the
> synchronization convention around node on/offlining. Is that protected
> together with CPU on/offlining?

The machinery is already there, we just need another array of nodemasks -
sched_domains_numa_nodes in addition to sched_domains_numa_masks. The
latter is already protected by RCU, and we need to update the new array
every time sched_domains_numa_masks is updated.
 
> Given that there usually aren't that many nodes, the current implementation
> is probably fine too, so please feel free to ignore this suggestion for now
> too.

I agree. The number of nodes on a typical system is 1 or 2. Even if
it's 8, Andrea's bubble sort will still be acceptable. So, I'm
OK with O(N^2) if you guys are OK with it. I would only like to have
this choice explained in the commit message.

Thanks,
Yury
