Message-ID: <5582EC99.8040005@redhat.com>
Date:	Thu, 18 Jun 2015 12:06:49 -0400
From:	Rik van Riel <riel@...hat.com>
To:	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
CC:	linux-kernel@...r.kernel.org, peterz@...radead.org,
	mingo@...nel.org, mgorman@...e.de
Subject: Re: [PATCH] sched,numa: document and fix numa_preferred_nid setting

On 06/18/2015 11:55 AM, Srikar Dronamraju wrote:
>>  	if (p->numa_group) {
>>  		if (env.best_cpu == -1)
>> @@ -1513,7 +1520,7 @@ static int task_numa_migrate(struct task_struct *p)
>>  			nid = env.dst_nid;
>>
>>  		if (node_isset(nid, p->numa_group->active_nodes))
>> -			sched_setnuma(p, env.dst_nid);
>> +			sched_setnuma(p, nid);
>>  	}
>>
>>  	/* No better CPU than the current one was found. */
>>
> 
> Overall this patch does seem to produce better results. However,
> numa02 is affected negatively.

OK, that is kind of expected.

The way numa02 runs means that, on a two-node system, we are
essentially guaranteed that both nodes end up in the numa_group's
active_nodes mask.
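
To make that concrete, here is a minimal sketch using the kernel's
nodemask helpers (illustrative only; the real mask is maintained by
update_numa_active_node_mask() in kernel/sched/fair.c, based on each
node's share of the group's NUMA faults):

	nodemask_t active_nodes = NODE_MASK_NONE;

	/*
	 * numa02 spreads its faults across both nodes of a two-node
	 * box, so both nodes pass the activity threshold:
	 */
	node_set(0, active_nodes);
	node_set(1, active_nodes);

	/*
	 * With both nodes set, the gate in task_numa_migrate() is
	 * always true for whichever node the task lands on:
	 */
	if (node_isset(nid, active_nodes))
		sched_setnuma(p, nid);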

What the above change does is slow down further migration once a
task has ended up on a NUMA node that is in
p->numa_group->active_nodes.

This is necessary to keep a very large workload stable once it has
converged on a set of NUMA nodes, but it does slow down convergence
for workloads like numa02, where the active_nodes mask covers both
nodes from early on.
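
Spelled out, the tail of task_numa_migrate() with the fix applied
looks roughly like this (the env.src_nid fallback is reconstructed
from the surrounding code; it is not part of the hunk quoted above):

	if (p->numa_group) {
		if (env.best_cpu == -1)
			nid = env.src_nid;	/* no better CPU: task stays put */
		else
			nid = env.dst_nid;	/* task is moving to dst_nid */

		/*
		 * Passing nid instead of env.dst_nid keeps
		 * numa_preferred_nid pointing at the node the task is
		 * actually on, so a task that stayed on src_nid no
		 * longer keeps trying to migrate toward dst_nid.
		 */
		if (node_isset(nid, p->numa_group->active_nodes))
			sched_setnuma(p, nid);
	}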

I can't think of any obvious way to both slow down movement once
things have converged and keep tasks moving quickly while they have
not yet converged.

It is worth noting that all the numa01 and numa02 benchmarks measure
is the speed at which the workloads converge. They do not measure
the overhead of making things converge, or how fast an actual
workload runs (NUMA locality benefit minus NUMA placement overhead).

> KernelVersion: 4.1.0-rc7-tip
> 	Testcase:         Min         Max         Avg      StdDev
>   elapsed_numa01:      858.85      949.18      915.64       33.06
>   elapsed_numa02:       23.09       29.89       26.43        2.18
> 	Testcase:         Min         Max         Avg      StdDev
>    system_numa01:     1516.72     1855.08     1686.24      113.95
>    system_numa02:       63.69       79.06       70.35        5.87
> 	Testcase:         Min         Max         Avg      StdDev
>      user_numa01:    73284.76    80818.21    78060.88     2773.60
>      user_numa02:     1690.18     2071.07     1821.64      140.25
> 	Testcase:         Min         Max         Avg      StdDev
>     total_numa01:    74801.50    82572.60    79747.12     2875.61
>     total_numa02:     1753.87     2142.77     1891.99      143.59
> 
> KernelVersion: 4.1.0-rc7-tip + your patch
> 
> 	Testcase:         Min         Max         Avg      StdDev     %Change
>   elapsed_numa01:      665.26      877.47      776.77       79.23      15.83%
>   elapsed_numa02:       24.59       31.30       28.17        2.48      -5.56%
> 	Testcase:         Min         Max         Avg      StdDev     %Change
>    system_numa01:      659.57     1220.99      942.36      234.92      60.92%
>    system_numa02:       44.62       86.01       64.64       14.24       6.64%
> 	Testcase:         Min         Max         Avg      StdDev     %Change
>      user_numa01:    56280.95    75908.81    64993.57     7764.30      17.21%
>      user_numa02:     1790.35     2155.02     1916.12      132.57      -4.38%
> 	Testcase:         Min         Max         Avg      StdDev     %Change
>     total_numa01:    56940.50    77128.20    65935.92     7993.49      17.91%
>     total_numa02:     1834.97     2227.03     1980.76      136.51      -3.99%
> 
