Message-ID: <Yd7pKuvjayH4q14L@hirez.programming.kicks-ass.net>
Date: Wed, 12 Jan 2022 15:43:54 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Gang Li <ligang.bdlg@...edance.com>
Cc: Jonathan Corbet <corbet@....net>, Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
linux-api@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: [PATCH v3] sched/numa: add per-process numa_balancing

On Mon, Dec 06, 2021 at 10:45:28AM +0800, Gang Li wrote:
> This patch adds a new API, PR_NUMA_BALANCING, to prctl.
>
> A large number of page faults will cause performance loss while NUMA
> balancing is running. Thus those processes which care about worst-case
> performance need NUMA balancing disabled. Others, on the contrary, accept a
> temporary performance loss in exchange for higher average performance, so
> enabling NUMA balancing is better for them.
>
> NUMA balancing can currently only be controlled globally via
> /proc/sys/kernel/numa_balancing. Because of the cases above, we want to
> disable/enable numa_balancing per-process instead.
>
> Add numa_balancing under mm_struct. Then use it in task_tick_fair.
>
> Set per-process numa balancing:
> prctl(PR_NUMA_BALANCING, PR_SET_NUMAB_DISABLE); //disable
> prctl(PR_NUMA_BALANCING, PR_SET_NUMAB_ENABLE); //enable
> prctl(PR_NUMA_BALANCING, PR_SET_NUMAB_DEFAULT); //follow global

This seems to imply you can prctl(ENABLE) even if the global is
disabled, IOW sched_numa_balancing is off.

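To make that concrete, the condition from the task_tick_fair() hunk quoted below can be modelled in userspace. This is only a sketch: the helper name numab_wants_tick(), the NUMAB_DISABLED constant and the enum values are assumptions for illustration, the static key is modelled as a plain bool, and the curr->mm NULL check is left out.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the per-mm states used by the patch.
 * NUMAB_ENABLED and NUMAB_DEFAULT appear in the quoted hunk;
 * NUMAB_DISABLED and all numeric values are assumed here.
 */
enum numab_mode {
	NUMAB_DEFAULT,
	NUMAB_DISABLED,
	NUMAB_ENABLED,
};

/*
 * Model of the quoted condition. global_on stands in for
 * static_branch_unlikely(&sched_numa_balancing).
 */
static bool numab_wants_tick(bool global_on, enum numab_mode mode)
{
	return mode == NUMAB_ENABLED ||
	       (global_on && mode == NUMAB_DEFAULT);
}
```

With this model, numab_wants_tick(false, NUMAB_ENABLED) is true: a task that asked for ENABLE gets task_tick_numa() even though the global switch is off, which is exactly the case in question.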
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 884f29d07963..2980f33ac61f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11169,8 +11169,12 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
> entity_tick(cfs_rq, se, queued);
> }
>
> - if (static_branch_unlikely(&sched_numa_balancing))
> +#ifdef CONFIG_NUMA_BALANCING
> + if (curr->mm && (curr->mm->numab_enabled == NUMAB_ENABLED
> + || (static_branch_unlikely(&sched_numa_balancing)
> + && curr->mm->numab_enabled == NUMAB_DEFAULT)))
> task_tick_numa(rq, curr);
> +#endif
>
> update_misfit_status(curr, rq);
> update_overutilized_status(task_rq(curr));

There's just about everything wrong there... not least of all the
horrific coding style.
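
The reply does not spell out the style objections. Assuming they include the leading `||`/`&&` on continuation lines and the #ifdef inside the function body, the same check could be laid out roughly as below. This is a userspace sketch for illustration only: struct mm_model, struct task_model, global_numa_balancing and task_numab_enabled() are made-up names, the enum values are assumed, and it keeps the patch's semantics as-is rather than addressing the ENABLE-while-global-is-off question.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types touched by the patch. */
enum numab_mode { NUMAB_DEFAULT, NUMAB_DISABLED, NUMAB_ENABLED };

struct mm_model {
	enum numab_mode numab_enabled;
};

struct task_model {
	struct mm_model *mm;	/* NULL for kernel threads */
};

/* Models the sched_numa_balancing static key as a plain flag. */
static bool global_numa_balancing;

/*
 * Same truth table as the quoted condition, written as a helper with
 * early returns so the caller needs neither the nested parentheses
 * nor the continuation lines.
 */
static inline bool task_numab_enabled(const struct task_model *p)
{
	if (!p->mm)
		return false;

	if (p->mm->numab_enabled == NUMAB_ENABLED)
		return true;

	return global_numa_balancing &&
	       p->mm->numab_enabled == NUMAB_DEFAULT;
}
```

In kernel code such a helper would usually sit next to a stub for the !CONFIG_NUMA_BALANCING case, so the call site in task_tick_fair() stays free of #ifdef. Note that, like the patch, this reads p->mm on every tick whether or not the global flag is set.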