Date:	Tue, 17 Dec 2013 15:41:35 -0500
From:	Rik van Riel <riel@...hat.com>
To:	David Timothy Strauss <david@...idstrauss.net>
CC:	Mel Gorman <mgorman@...e.de>, Ingo Molnar <mingo@...nel.org>,
	linux-kernel@...r.kernel.org
Subject: Re: NUMA, migrate/N, and tuned-adm

On 12/17/2013 01:10 PM, David Timothy Strauss wrote:

> System specs:
>  * Fedora 19 with the 3.11.10-200.fc19.x86_64 kernel (just the stock RPM)
>  * Bare-metal servers with 128GB RAM split between two NUMA regions,
> each region with one hex-core processor
>  * More than 700 processes, a couple hundred of which are active
> fairly frequently. The systems were at 7000 processes, but we've
> dropped it while we dive into this issue.
>  * Many of the processes are short-lived. The long-lived ones
> experience spikes in CPU and memory usage while processing requests.
> 
> Here's what we've tried, to no avail:
>  * tuned-adm on latency-performance and virtual-host profiles; this
> places the system on the deadline scheduler, but this problem occurred
> on the default one too
>  * kernel.sched_migration_cost_ns=5000000 (which tuned will do for
> those profiles in v3.3/Fedora 20)
>  * numad to balance between regions
>  * Global use of sched_relax_domain_level=1 and sched_relax_domain_level=2
>  * Splitting the system with cpuset into management tasks (6 virtual
> cores) and workload tasks (18 virtual cores) with
> sched_relax_domain_level=2. This is based on recommendations for NUMA
> systems in the cpuset man page.

Just for a quick sanity check, can you try disabling the
automatic NUMA balancing code?

# echo NO_NUMA > /sys/kernel/debug/sched_features
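For reference, a minimal sketch of reading the flag back: the current state can be parsed out of the same debugfs file, where the flag shows up as NO_NUMA when balancing is off and NUMA when it is on. The helper name and the sample flag string below are illustrative, not real output.

```shell
#!/bin/sh
# Illustrative helper: report whether automatic NUMA balancing is
# active by parsing the sched_features flag string. The flag reads
# NO_NUMA when disabled and NUMA when enabled.
numa_state() {
    case " $1 " in
        *" NO_NUMA "*) echo "numa balancing: disabled" ;;
        *" NUMA "*)    echo "numa balancing: enabled" ;;
        *)             echo "numa balancing: flag not present" ;;
    esac
}

# On a live system the flags come from debugfs (needs root and a
# mounted debugfs):
#   flags=$(cat /sys/kernel/debug/sched_features)
# Sample flag string for illustration:
numa_state "GENTLE_FAIR_SLEEPERS NO_NUMA LB_BIAS"
```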

> Here's what we've used for analysis:
>  * powertop
>  * top/htop
>  * perf record -a -g

Does "perf report -g" show where the calls to the
migration code are coming from? Something must be
migrating tasks around, and it will be good to know
what it is...
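By way of illustration, the filtering step might look like the sketch below. The report text is a made-up excerpt standing in for real "perf report -g --stdio" output over the recorded data; the symbol names are only examples of migration-related entries one might see.

```shell
#!/bin/sh
# Sketch: filter a perf report for scheduler-migration entry points.
# The here-string is a hypothetical excerpt standing in for:
#   perf report -g --stdio -i perf.data
# run against data from "perf record -a -g".
report='    12.34%  swapper  [kernel.kallsyms]  [k] migration_cpu_stop
     8.90%  worker   [kernel.kallsyms]  [k] migrate_task_rq_fair
     1.20%  worker   libc-2.17.so       [.] memcpy'

# Count the samples whose symbol mentions migration:
echo "$report" | grep -c 'migrat'
```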

-- 
All rights reversed
