linux-kernel - Re: [PATCH 2/6] numa,sched: track from which nodes NUMA faults are triggered

Message-ID: <52DEF41F.1040105@redhat.com>
Date:	Tue, 21 Jan 2014 17:26:39 -0500
From:	Rik van Riel <riel@...hat.com>
To:	Mel Gorman <mgorman@...e.de>
CC:	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	peterz@...radead.org, mingo@...hat.com, chegu_vinod@...com
Subject: Re: [PATCH 2/6] numa,sched: track from which nodes NUMA faults are triggered

On 01/21/2014 07:21 AM, Mel Gorman wrote:
> On Mon, Jan 20, 2014 at 02:21:03PM -0500, riel@...hat.com wrote:

>> +++ b/include/linux/sched.h
>> @@ -1492,6 +1492,14 @@ struct task_struct {
>>  	unsigned long *numa_faults_buffer;
>>  
>>  	/*
>> +	 * Track the nodes where faults are incurred. This is not very
>> +	 * interesting on a per-task basis, but it help with smarter
>> +	 * numa memory placement for groups of processes.
>> +	 */
>> +	unsigned long *numa_faults_from;
>> +	unsigned long *numa_faults_from_buffer;
>> +
> 
> As an aside I wonder if we can derive any useful metric from this

It may provide a better way to tune the NUMA scan interval than
the current code does, since the "local vs remote" ratio does not
tell us much when a workload is spread across multiple NUMA
nodes.
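
To illustrate the point about the "local vs remote" ratio, here is a small
userspace sketch (not kernel code). The fixed MAX_NODES value, the
faults[cpu][mem] matrix and the local_percent() helper are all assumptions
made up for this example; nothing like them appears in the patch.

```c
#include <assert.h>

#define MAX_NODES 4	/* hypothetical node count, for illustration only */

/*
 * faults[cpu][mem] holds the pages faulted by CPUs on node `cpu`
 * while touching memory that lives on node `mem`.
 * Returns the percentage of faults that were node-local.
 */
static unsigned long local_percent(unsigned long faults[MAX_NODES][MAX_NODES])
{
	unsigned long local = 0, total = 0;

	for (int cpu = 0; cpu < MAX_NODES; cpu++) {
		for (int mem = 0; mem < MAX_NODES; mem++) {
			total += faults[cpu][mem];
			if (cpu == mem)
				local += faults[cpu][mem];
		}
	}
	return total ? (100 * local) / total : 0;
}
```

With this model, a workload whose threads run on all four nodes and touch
shared memory evenly reports 25% local faults no matter how well it is
placed, so the ratio alone says little about whether scanning faster or
slower would help.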

>>  		grp->total_faults = p->total_numa_faults;
>> @@ -1526,7 +1536,7 @@ static void task_numa_group(struct task_struct *p, int cpupid, int flags,
>>  
>>  	double_lock(&my_grp->lock, &grp->lock);
>>  
>> -	for (i = 0; i < 2*nr_node_ids; i++) {
>> +	for (i = 0; i < 4*nr_node_ids; i++) {
>>  		my_grp->faults[i] -= p->numa_faults[i];
>>  		grp->faults[i] += p->numa_faults[i];
>>  	}
> 
> The same obscure trick is used throughout and I'm not sure how
> maintainable that will be. Would it be better to be explicit about this?

I have made a cleanup patch for this, using the defines you
suggested.
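
As a rough idea of what "explicit" could look like, here is a userspace
sketch (not the actual cleanup patch). The enum, the macro and the helper
names are hypothetical, and the layout is an assumption: one flat array
holding the counters indexed by memory node first, then the counters indexed
by CPU node, each with a shared and a private slot per node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the defines in the real cleanup patch may differ. */
enum fault_stat {
	FAULT_MEM,	/* indexed by the node the memory lives on */
	FAULT_CPU,	/* indexed by the node the faulting CPU is on */
	NR_FAULT_STATS
};

#define NR_FAULT_TYPES 2	/* shared (0) and private (1) */

/* Total number of counters; replaces a bare "4*nr_node_ids". */
static inline size_t faults_array_len(int nr_nodes)
{
	return (size_t)NR_FAULT_STATS * NR_FAULT_TYPES * nr_nodes;
}

/* Index of one counter in the flat array. */
static inline size_t faults_idx(enum fault_stat stat, int nr_nodes,
				int nid, int priv)
{
	assert(nid >= 0 && nid < nr_nodes);
	assert(priv == 0 || priv == 1);
	return (size_t)NR_FAULT_TYPES * ((size_t)stat * nr_nodes + nid) + priv;
}

/* Move a task's counters from one group array to another. */
static void move_faults(unsigned long *old_grp, unsigned long *new_grp,
			const unsigned long *task, int nr_nodes)
{
	size_t n = faults_array_len(nr_nodes);

	for (size_t i = 0; i < n; i++) {
		old_grp[i] -= task[i];
		new_grp[i] += task[i];
	}
}
```

The loop bound then names what it covers, so adding another statistic only
means extending the enum.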

>> @@ -1634,6 +1649,7 @@ void task_numa_fault(int last_cpupid, int node, int pages, int flags)
>>  		p->numa_pages_migrated += pages;
>>  
>>  	p->numa_faults_buffer[task_faults_idx(node, priv)] += pages;
>> +	p->numa_faults_from_buffer[task_faults_idx(this_node, priv)] += pages;
>>  	p->numa_faults_locality[!!(flags & TNF_FAULT_LOCAL)] += pages;
> 
> this_node and node are similarly ambiguous names. Renaming them to
> data_node and cpu_node would have been clearer.

I have added a patch for this to the next version of the series.

Don't want to make the series too large, though :)
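
For reference, the accounting from the hunk above with the suggested names
could look roughly like the userspace sketch below. Only data_node and
cpu_node come from the review; struct fault_stats, NR_NODES, idx() and
record_fault() are hypothetical names for this example, and the index
formula (two slots per node, shared then private) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES 4		/* hypothetical node count */
#define NR_FAULT_TYPES 2	/* shared (0) and private (1) */

struct fault_stats {
	/* faults counted by the node holding the memory */
	unsigned long by_data_node[NR_NODES * NR_FAULT_TYPES];
	/* faults counted by the node whose CPU took the fault */
	unsigned long by_cpu_node[NR_NODES * NR_FAULT_TYPES];
};

static size_t idx(int nid, int priv)
{
	return (size_t)NR_FAULT_TYPES * nid + (priv ? 1 : 0);
}

/* Account one NUMA hinting fault under both views. */
static void record_fault(struct fault_stats *s, int data_node, int cpu_node,
			 int priv, unsigned long pages)
{
	assert(data_node >= 0 && data_node < NR_NODES);
	assert(cpu_node >= 0 && cpu_node < NR_NODES);

	s->by_data_node[idx(data_node, priv)] += pages;
	s->by_cpu_node[idx(cpu_node, priv)] += pages;
}
```

With both nodes named in the signature, a call site shows at a glance which
argument is the memory location and which is the CPU location.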

-- 
All rights reversed