Message-Id: <20160517141308.8a2df2028d19d375d3660e1f@linux-foundation.org>
Date: Tue, 17 May 2016 14:13:08 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Andi Kleen <andi@...stfloor.org>
Cc: tglx@...utronix.de, linux-kernel@...r.kernel.org,
Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH] Allocate idle task for a CPU always on its local node
On Tue, 17 May 2016 06:44:54 -0700 Andi Kleen <andi@...stfloor.org> wrote:
> From: Andi Kleen <ak@...ux.intel.com>
>
> Linux pre-allocates the task structs of the idle tasks for all possible CPUs.
> This currently means they all end up on node 0. This also implies
> that the cache lines used for MWAIT, which sit around the flags field in
> the task struct, are all located on node 0.
>
> We see a noticeable performance improvement on Knights Landing CPUs when
> the cache lines used for MWAIT are located in the local nodes of the CPUs
> using them. I would expect this to give a (likely slight) improvement
> on other systems too.
>
> The patch places the idle task on the node of its CPU
> by passing the right target node to copy_process().
>
Looks nice.

This is nicer ;)


From: Andrew Morton <akpm@...ux-foundation.org>
Subject: allocate-idle-task-for-a-cpu-always-on-its-local-node-fix

use NUMA_NO_NODE, not a bare -1

Cc: Andi Kleen <ak@...ux.intel.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
---
kernel/fork.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff -puN kernel/fork.c~allocate-idle-task-for-a-cpu-always-on-its-local-node-fix kernel/fork.c
--- a/kernel/fork.c~allocate-idle-task-for-a-cpu-always-on-its-local-node-fix
+++ a/kernel/fork.c
@@ -346,7 +346,7 @@ static struct task_struct *dup_task_stru
 	struct thread_info *ti;
 	int err;
 
-	if (node < 0)
+	if (node == NUMA_NO_NODE)
 		node = tsk_fork_get_node(orig);
 	tsk = alloc_task_struct_node(node);
 	if (!tsk)
@@ -1754,7 +1754,7 @@ long _do_fork(unsigned long clone_flags,
 	}
 
 	p = copy_process(clone_flags, stack_start, stack_size,
-			 child_tidptr, NULL, trace, tls, -1);
+			 child_tidptr, NULL, trace, tls, NUMA_NO_NODE);
 	/*
 	 * Do this prior waking up the new thread - the thread pointer
 	 * might get invalid after that point, if the thread exits quickly.
_
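
For readers outside the kernel tree, the pattern the patch relies on can be
sketched in plain userspace C. This is only an illustration, not the kernel
code: NUMA_NO_NODE is redefined locally with the kernel's value of -1, and
every sketch_* name, the four-CPU/two-node table, and the fallback_node
argument (a stand-in for whatever tsk_fork_get_node() would return) are
hypothetical.

```c
#include <assert.h>

/* Local copy of the sentinel; the kernel defines NUMA_NO_NODE as (-1). */
#define NUMA_NO_NODE (-1)

/* Hypothetical topology for illustration: 4 CPUs spread over 2 nodes. */
#define SKETCH_NR_CPUS 4
static const int sketch_cpu_node[SKETCH_NR_CPUS] = { 0, 0, 1, 1 };

/* Hypothetical stand-in for the kernel's cpu_to_node(). */
int sketch_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_cpu_node[cpu];
}

/*
 * Pick the node to allocate on. Only the explicit sentinel selects the
 * fallback; any real node id passed by the caller is used as-is.
 */
int sketch_resolve_node(int requested, int fallback_node)
{
	if (requested == NUMA_NO_NODE)
		return fallback_node;
	return requested;
}

/* Ordinary fork path: the caller has no node preference. */
int sketch_fork_node(int fallback_node)
{
	return sketch_resolve_node(NUMA_NO_NODE, fallback_node);
}

/* Idle-task path: ask for the node that owns the target CPU. */
int sketch_idle_node(int cpu, int fallback_node)
{
	return sketch_resolve_node(sketch_cpu_to_node(cpu), fallback_node);
}
```

With this table, sketch_idle_node(2, 0) resolves to node 1 even though the
fallback is node 0, while sketch_fork_node(0) stays on node 0. Comparing
against the named sentinel rather than testing "< 0" also makes it explicit
that -1 is the only value meaning "no preference".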