Message-ID: <20140128100131.GS4963@suse.de>
Date: Tue, 28 Jan 2014 10:01:31 +0000
From: Mel Gorman <mgorman@...e.de>
To: riel@...hat.com
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
peterz@...radead.org, mingo@...hat.com, chegu_vinod@...com
Subject: Re: [PATCH 6/9] numa,sched: normalize faults_cpu stats and weigh by
CPU use
On Mon, Jan 27, 2014 at 05:03:45PM -0500, riel@...hat.com wrote:
> From: Rik van Riel <riel@...hat.com>
>
> Tracing the code that decides the active nodes has made it abundantly clear
> that the naive implementation of the faults_from code has issues.
>
> Specifically, the garbage collector in some workloads will access orders
> of magnitude more memory than the threads that do all the active work.
> This resulted in the node with the garbage collector being marked the only
> active node in the group.
>
> This issue is avoided if we weigh the statistics by the CPU use of each task
> in the numa group, instead of by how many faults each thread has incurred.
>
> To achieve this, we normalize the number of faults to the fraction of faults
> that occurred on each node, and then multiply that fraction by the fraction
> of CPU time the task has used since the last time task_numa_placement was
> invoked.
>
> This way the nodes in the active node mask will be the ones where the tasks
> from the numa group are most actively running, and the influence of, e.g.,
> the garbage collector and other do-little threads is properly minimized.
>
> On a 4 node system, using CPU use statistics calculated over a longer interval
> results in about 1% fewer page migrations with two 32-warehouse specjbb runs,
> and about 5% fewer page migrations as well as 1% better throughput with two
> 8-warehouse specjbb runs, compared with the shorter term statistics kept by
> the scheduler.
>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Mel Gorman <mgorman@...e.de>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Chegu Vinod <chegu_vinod@...com>
> Signed-off-by: Rik van Riel <riel@...hat.com>
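
The weighting described above reduces to a per-node fraction of faults scaled
by a per-task fraction of CPU use. For reference, a minimal userspace sketch
of that calculation follows; it is not the kernel implementation, and the
struct fields (numa_faults, total_faults, runtime_delta, period_delta) and the
fixed-point scale of 1000 are illustrative stand-ins only.

#include <stdio.h>

#define NR_NODES 4

/*
 * Hypothetical per-task NUMA statistics, standing in for the state
 * that task_numa_placement() works from.
 */
struct task_stats {
	unsigned long numa_faults[NR_NODES];	/* faults incurred on each node */
	unsigned long total_faults;		/* sum over all nodes */
	unsigned long runtime_delta;		/* CPU time used since last placement */
	unsigned long period_delta;		/* wall time since last placement */
};

/*
 * Weight a task's contribution to node nid: the fraction of the task's
 * faults that hit nid, scaled by the fraction of CPU the task actually
 * used.  Fixed-point arithmetic (scale 1000) stands in for the kernel's
 * shift-based scaling that avoids overflow.
 */
static unsigned long faults_cpu_weight(const struct task_stats *ts, int nid)
{
	unsigned long fault_frac, cpu_frac;

	if (!ts->total_faults || !ts->period_delta)
		return 0;

	fault_frac = ts->numa_faults[nid] * 1000 / ts->total_faults;
	cpu_frac = ts->runtime_delta * 1000 / ts->period_delta;

	return fault_frac * cpu_frac / 1000;
}

int main(void)
{
	/* A garbage-collector-like thread: many faults, little CPU use. */
	struct task_stats gc = {
		.numa_faults = { 9000, 100, 100, 100 },
		.total_faults = 9300,
		.runtime_delta = 50,
		.period_delta = 1000,
	};
	/* A worker thread: fewer faults, but nearly all of a CPU. */
	struct task_stats worker = {
		.numa_faults = { 50, 900, 25, 25 },
		.total_faults = 1000,
		.runtime_delta = 950,
		.period_delta = 1000,
	};

	printf("gc weight on node 0:     %lu\n", faults_cpu_weight(&gc, 0));
	printf("worker weight on node 1: %lu\n", faults_cpu_weight(&worker, 1));
	return 0;
}

With those (made-up) numbers, the garbage-collector-like thread that incurs
most of the faults but uses little CPU ends up with a much smaller weight than
the worker thread that faults less but runs almost continuously, which is the
effect the patch is after.
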
The major changes are to the weight calculations, which now avoid overflow,
and to the average runtime, which is calculated over a longer period than in
the v4 version. Both seem sane so
Acked-by: Mel Gorman <mgorman@...e>
--
Mel Gorman
SUSE Labs