Message-ID: <20140516133805.GS11096@twins.programming.kicks-ass.net>
Date: Fri, 16 May 2014 15:38:05 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Rik van Riel <riel@...hat.com>
Cc: linux-kernel@...r.kernel.org, chegu_vinod@...com, mingo@...nel.org,
umgwanakikbuti@...il.com
Subject: Re: [PATCH RFC] sched,numa: move tasks to preferred_node at wakeup time
On Fri, May 16, 2014 at 02:14:50AM -0400, Rik van Riel wrote:
> +#ifdef CONFIG_NUMA_BALANCING
> +static int numa_balance_on_wake(struct task_struct *p, int prev_cpu)
> +{
> + long load, src_load, dst_load;
> + int cur_node = cpu_to_node(prev_cpu);
> + struct numa_group *numa_group = ACCESS_ONCE(p->numa_group);
> + struct sched_domain *sd;
> + struct task_numa_env env = {
> + .p = p,
> + .best_task = NULL,
> + .best_imp = 0,
> + .best_cpu = -1
> + };
That's all code; ideally you'd move that to after we're done checking the
reasons to not do work, say somewhere like...
> +
> + if (!sched_feat(NUMA))
> + return prev_cpu;
Yah.. :-( I think some people changed that to numabalancing_enabled.
Fixing that is still on the todo list somewhere.
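
(Just to illustrate, untested and only a sketch, assuming the
numabalancing_enabled flag that the runtime toggle uses these days:)

	/* Use the runtime NUMA balancing toggle, not the feature bit. */
	if (!numabalancing_enabled)
		return prev_cpu;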
> +
> + if (p->numa_preferred_nid == -1)
> + return prev_cpu;
> +
> + if (p->numa_preferred_nid == cur_node)
> + return prev_cpu;
> +
> + if (numa_group && node_isset(cur_node, numa_group->active_nodes))
> + return prev_cpu;
> +
> + sd = rcu_dereference(per_cpu(sd_numa, env.src_cpu));
> + if (sd)
> + env.imbalance_pct = 100 + (sd->imbalance_pct - 100) / 2;
> +
> + /*
> + * Cpusets can break the scheduler domain tree into smaller
> + * balance domains, some of which do not cross NUMA boundaries.
> + * Tasks that are "trapped" in such domains cannot be migrated
> + * elsewhere, so there is no point in (re)trying.
> + */
> + if (unlikely(!sd)) {
How about you bail early, and then have the above test evaporate?
> + p->numa_preferred_nid = cur_node;
> + return prev_cpu;
> + }
.. here.
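
Roughly like the below; completely untested, just a sketch reusing the
names from your patch, and note I'm using prev_cpu directly for the
sd_numa lookup here:

static int numa_balance_on_wake(struct task_struct *p, int prev_cpu)
{
	long load, src_load, dst_load;
	int cur_node = cpu_to_node(prev_cpu);
	struct numa_group *numa_group = ACCESS_ONCE(p->numa_group);
	struct sched_domain *sd;
	struct task_numa_env env = {
		.p = p,
		.best_task = NULL,
		.best_imp = 0,
		.best_cpu = -1,
	};

	/* All the cheap "nothing to do" checks come first. */
	if (!numabalancing_enabled)
		return prev_cpu;

	if (p->numa_preferred_nid == -1 || p->numa_preferred_nid == cur_node)
		return prev_cpu;

	if (numa_group && node_isset(cur_node, numa_group->active_nodes))
		return prev_cpu;

	/*
	 * Cpusets can break the scheduler domain tree into smaller
	 * balance domains, some of which do not cross NUMA boundaries.
	 * Tasks "trapped" in such domains cannot be migrated elsewhere,
	 * so bail early and remember that to avoid retrying.
	 */
	sd = rcu_dereference(per_cpu(sd_numa, prev_cpu));
	if (unlikely(!sd)) {
		p->numa_preferred_nid = cur_node;
		return prev_cpu;
	}

	/* Only now do the expensive part. */
	env.imbalance_pct = 100 + (sd->imbalance_pct - 100) / 2;
	env.src_nid = cur_node;
	env.dst_nid = p->numa_preferred_nid;

	update_numa_stats(&env.src_stats, env.src_nid);
	update_numa_stats(&env.dst_stats, env.dst_nid);

	/* XXX missing power terms */
	load = task_h_load(p);
	dst_load = env.dst_stats.load + load;
	src_load = env.src_stats.load - load;

	if (load_too_imbalanced(env.src_stats.load, env.dst_stats.load,
				src_load, dst_load, &env))
		return prev_cpu;

	return cpumask_first(cpumask_of_node(p->numa_preferred_nid));
}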
> +
> + /*
> + * Only allow p to move back to its preferred nid if
> + * that does not create an imbalance that would cause
> + * the load balancer to move a task around later.
> + */
> + env.src_nid = cur_node;
> + env.dst_nid = p->numa_preferred_nid;
> +
> + update_numa_stats(&env.src_stats, env.src_nid);
> + update_numa_stats(&env.dst_stats, env.dst_nid);
> +
> + dst_load = env.dst_stats.load;
> + src_load = env.src_stats.load;
> +
> + /* XXX missing power terms */
> + load = task_h_load(p);
> + dst_load += load;
> + src_load -= load;
> +
> + if (load_too_imbalanced(env.src_stats.load, env.dst_stats.load,
> + src_load, dst_load, &env))
> + return prev_cpu;
So I'm thinking that load_too_imbalanced() is from another patch I
haven't yet seen, lemme go see if you did send it and I missed it.
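
Until I find it, my guess at its shape, going purely by the arguments
used above (treat this as a sketch, not the actual patch):

	/*
	 * Sketch only: reject the move if it would make the src/dst
	 * imbalance worse than it already is, scaled by imbalance_pct.
	 */
	static bool load_too_imbalanced(long orig_src_load, long orig_dst_load,
					long src_load, long dst_load,
					struct task_numa_env *env)
	{
		long old_imb, imb;

		/* Compare the imbalance before and after the proposed move. */
		old_imb = abs(orig_dst_load * 100 - orig_src_load * env->imbalance_pct);
		imb = abs(dst_load * 100 - src_load * env->imbalance_pct);

		/* Did the move make things worse? */
		return imb > old_imb;
	}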
> +
> + return cpumask_first(cpumask_of_node(p->numa_preferred_nid));
> +}
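
Presumably the wakeup path calls this from select_task_rq_fair();
something like the below, a sketch only since that hunk isn't quoted
here:

	/*
	 * Possible call site (not part of the quoted hunk): on wakeup,
	 * consult NUMA balancing for a CPU on the preferred node before
	 * the usual wake-affine / idle-sibling logic runs.
	 */
	if (sd_flag & SD_BALANCE_WAKE)
		prev_cpu = numa_balance_on_wake(p, prev_cpu);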