Message-ID: <c4678658cdd04d14ced7d0407da32f5fdec19f95.camel@linux.intel.com>
Date: Tue, 07 Jun 2022 15:51:57 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
linux-mm@...ck.org, akpm@...ux-foundation.org
Cc: Wei Xu <weixugc@...gle.com>, Huang Ying <ying.huang@...el.com>,
Greg Thelen <gthelen@...gle.com>,
Yang Shi <shy828301@...il.com>,
Davidlohr Bueso <dave@...olabs.net>,
Tim C Chen <tim.c.chen@...el.com>,
Brice Goglin <brice.goglin@...il.com>,
Michal Hocko <mhocko@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Hesham Almatary <hesham.almatary@...wei.com>,
Dave Hansen <dave.hansen@...el.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Alistair Popple <apopple@...dia.com>,
Dan Williams <dan.j.williams@...el.com>,
Feng Tang <feng.tang@...el.com>,
Jagdish Gediya <jvgediya@...ux.ibm.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH v5 4/9] mm/demotion: Build demotion targets based on
explicit memory tiers
On Fri, 2022-06-03 at 19:12 +0530, Aneesh Kumar K.V wrote:
>
> +int next_demotion_node(int node)
> +{
> + struct demotion_nodes *nd;
> + int target, nnodes, i;
> +
> + if (!node_demotion)
> + return NUMA_NO_NODE;
> +
> + nd = &node_demotion[node];
> +
> + /*
> + * node_demotion[] is updated without excluding this
> + * function from running.
> + *
> + * Make sure to use RCU over entire code blocks if
> + * node_demotion[] reads need to be consistent.
> + */
> + rcu_read_lock();
> +
> + nnodes = nodes_weight(nd->preferred);
> + if (!nnodes)
> + return NUMA_NO_NODE;
> +
> + /*
> + * If there are multiple target nodes, just select one
> + * target node randomly.
> + *
> + * In addition, we can also use round-robin to select
> + * target node, but we should introduce another variable
> + * for node_demotion[] to record last selected target node,
> + * that may cause cache ping-pong due to the changing of
> + * last target node. Or introducing per-cpu data to avoid
> + * caching issue, which seems more complicated. So selecting
> + * target node randomly seems better until now.
> + */
> + nnodes = get_random_int() % nnodes;
> + target = first_node(nd->preferred);
> + for (i = 0; i < nnodes; i++)
> + target = next_node(target, nd->preferred);
We can simplify the above 4 lines to:

	target = node_random(&nd->preferred);

There's still the loop overhead inside node_random() though :(
> +
> + rcu_read_unlock();
> +
> + return target;
> +}
> +
>
> + */
> +static int __meminit migrate_on_reclaim_callback(struct notifier_block *self,
> + unsigned long action, void *_arg)
> +{
> + struct memory_notify *arg = _arg;
> +
> + /*
> + * Only update the node migration order when a node is
> + * changing status, like online->offline.
> + */
> + if (arg->status_change_nid < 0)
> + return notifier_from_errno(0);
> +
> + switch (action) {
> + case MEM_OFFLINE:
> + /*
> + * In case we are moving out of N_MEMORY. Keep the node
> + * in the memory tier so that when we bring memory online,
> + * they appear in the right memory tier. We still need
> + * to rebuild the demotion order.
> + */
> + mutex_lock(&memory_tier_lock);
> + establish_migration_targets();
> + mutex_unlock(&memory_tier_lock);
> + break;
> + case MEM_ONLINE:
> + /*
> + * We ignore the error here, if the node already have the tier
> + * registered, we will continue to use that for the new memory
> + * we are adding here.
> + */
> + node_set_memory_tier(arg->status_change_nid, DEFAULT_MEMORY_TIER);
Should establish_migration_targets() also be run here? Otherwise, what are
the demotion targets for this newly onlined node?
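If so, something along these lines (untested sketch, reusing the same lock and helpers the patch already introduces) might be needed:

```c
	case MEM_ONLINE:
		mutex_lock(&memory_tier_lock);
		node_set_memory_tier(arg->status_change_nid,
				     DEFAULT_MEMORY_TIER);
		/* rebuild demotion order so the new node gets targets */
		establish_migration_targets();
		mutex_unlock(&memory_tier_lock);
		break;
```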
> + break;
> + }
> +
> + return notifier_from_errno(0);
> +}
> +
Tim