Message-ID: <5582EC99.8040005@redhat.com>
Date:	Thu, 18 Jun 2015 12:06:49 -0400
From:	Rik van Riel <riel@...hat.com>
To:	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
CC:	linux-kernel@...r.kernel.org, peterz@...radead.org,
	mingo@...nel.org, mgorman@...e.de
Subject: Re: [PATCH] sched,numa: document and fix numa_preferred_nid setting

On 06/18/2015 11:55 AM, Srikar Dronamraju wrote:
>>  	if (p->numa_group) {
>>  		if (env.best_cpu == -1)
>> @@ -1513,7 +1520,7 @@ static int task_numa_migrate(struct task_struct *p)
>>  			nid = env.dst_nid;
>>
>>  		if (node_isset(nid, p->numa_group->active_nodes))
>> -			sched_setnuma(p, env.dst_nid);
>> +			sched_setnuma(p, nid);
>>  	}
>>
>>  	/* No better CPU than the current one was found. */
>>
> 
> Overall this patch does seem to produce better results. However,
> numa02 is affected negatively.

OK, that is kind of expected.

The way numa02 runs means that, on a two-node system, we are
essentially guaranteed that both nodes end up in the numa_group's
active_nodes mask.
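
To make that concrete, here is a minimal sketch using the kernel's
nodemask helpers (illustrative only; the real mask is maintained by
update_numa_active_node_mask() in kernel/sched/fair.c, based on each
node's share of the group's NUMA faults):

	nodemask_t active_nodes = NODE_MASK_NONE;

	/*
	 * numa02 spreads its faults across both nodes of a two-node
	 * box, so both nodes pass the activity threshold:
	 */
	node_set(0, active_nodes);
	node_set(1, active_nodes);

	/*
	 * With both nodes set, the gate in task_numa_migrate() is
	 * always true for whichever node the task lands on:
	 */
	if (node_isset(nid, active_nodes))
		sched_setnuma(p, nid);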

What the above change does is slow down further migration once a
task has ended up on a NUMA node that is in
p->numa_group->active_nodes.

This is necessary to keep a very large workload stable once it has
converged on a set of NUMA nodes, but it does slow down convergence
for workloads like numa02, where the active_nodes mask covers both
nodes from early on.
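
Spelled out, the tail of task_numa_migrate() with the fix applied
looks roughly like this (the env.src_nid fallback is reconstructed
from the surrounding code; it is not part of the hunk quoted above):

	if (p->numa_group) {
		if (env.best_cpu == -1)
			nid = env.src_nid;	/* no better CPU: task stays put */
		else
			nid = env.dst_nid;	/* task is moving to dst_nid */

		/*
		 * Passing nid instead of env.dst_nid keeps
		 * numa_preferred_nid pointing at the node the task is
		 * actually on, so a task that stayed on src_nid no
		 * longer keeps trying to migrate toward dst_nid.
		 */
		if (node_isset(nid, p->numa_group->active_nodes))
			sched_setnuma(p, nid);
	}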

I can't think of any obvious way to both slow down movement once
things have converged and keep tasks moving quickly while they have
not yet converged.

It is worth noting that all the numa01 and numa02 benchmarks measure
is the speed at which the workloads converge. They do not measure
the overhead of making things converge, or how fast an actual
workload runs (NUMA locality benefit minus NUMA placement overhead).

> KernelVersion: 4.1.0-rc7-tip
> 	Testcase:         Min         Max         Avg      StdDev
>   elapsed_numa01:      858.85      949.18      915.64       33.06
>   elapsed_numa02:       23.09       29.89       26.43        2.18
> 	Testcase:         Min         Max         Avg      StdDev
>    system_numa01:     1516.72     1855.08     1686.24      113.95
>    system_numa02:       63.69       79.06       70.35        5.87
> 	Testcase:         Min         Max         Avg      StdDev
>      user_numa01:    73284.76    80818.21    78060.88     2773.60
>      user_numa02:     1690.18     2071.07     1821.64      140.25
> 	Testcase:         Min         Max         Avg      StdDev
>     total_numa01:    74801.50    82572.60    79747.12     2875.61
>     total_numa02:     1753.87     2142.77     1891.99      143.59
> 
> KernelVersion: 4.1.0-rc7-tip + your patch
> 
> 	Testcase:         Min         Max         Avg      StdDev     %Change
>   elapsed_numa01:      665.26      877.47      776.77       79.23      15.83%
>   elapsed_numa02:       24.59       31.30       28.17        2.48      -5.56%
> 	Testcase:         Min         Max         Avg      StdDev     %Change
>    system_numa01:      659.57     1220.99      942.36      234.92      60.92%
>    system_numa02:       44.62       86.01       64.64       14.24       6.64%
> 	Testcase:         Min         Max         Avg      StdDev     %Change
>      user_numa01:    56280.95    75908.81    64993.57     7764.30      17.21%
>      user_numa02:     1790.35     2155.02     1916.12      132.57      -4.38%
> 	Testcase:         Min         Max         Avg      StdDev     %Change
>     total_numa01:    56940.50    77128.20    65935.92     7993.49      17.91%
>     total_numa02:     1834.97     2227.03     1980.76      136.51      -3.99%
> 
