[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64N.0701251336080.13206@attu4.cs.washington.edu>
Date: Thu, 25 Jan 2007 13:37:33 -0800 (PST)
From: David Rientjes <rientjes@...washington.edu>
To: Andrew Morton <akpm@...l.org>
cc: Andi Kleen <ak@...e.de>, Rohit Seth <rohitseth@...gle.com>,
linux-kernel@...r.kernel.org
Subject: [patch -mm 5/5] x86_64: fake numa for cpusets document
Create a document to explain how to use numa=fake in conjunction with
cpusets for coarse memory resource management.
An attempt to get more awareness and testing for this feature.
Cc: Andi Kleen <ak@...e.de>
Signed-off-by: David Rientjes <rientjes@...washington.edu>
---
Documentation/x86_64/fake-numa-for-cpusets | 66 ++++++++++++++++++++++++++++
1 files changed, 66 insertions(+), 0 deletions(-)
create mode 100644 Documentation/x86_64/fake-numa-for-cpusets
diff --git a/Documentation/x86_64/fake-numa-for-cpusets b/Documentation/x86_64/fake-numa-for-cpusets
new file mode 100644
index 0000000..d1a985c
--- /dev/null
+++ b/Documentation/x86_64/fake-numa-for-cpusets
@@ -0,0 +1,66 @@
+Using numa=fake and CPUSets for Resource Management
+Written by David Rientjes <rientjes@...washington.edu>
+
+This document describes how the numa=fake x86_64 command-line option can be used
+in conjunction with cpusets for coarse memory management. Using this feature,
+you can create fake NUMA nodes that represent contiguous chunks of memory and
+assign them to cpusets and their attached tasks. This is a way of limiting the
+amount of system memory that are available to a certain class of tasks.
+
+For more information on the features of cpusets, see Documentation/cpusets.txt.
+There are a number of different configurations you can use for your needs. For
+more information on the numa=fake command line option and its various ways of
+configuring fake nodes, see Documentation/x86_64/boot-options.txt.
+
+For the purposes of this introduction, we'll assume a very primitive NUMA
+emulation setup of "numa=fake=4*512,". This will split our system memory into
+four equal chunks of 512M each that we can now use to assign to cpusets. As
+you become more familiar with using this combination for resource control,
+you'll determine a better setup to minimize the number of nodes you have to deal
+with.
+
+A machine may be split as follows with "numa=fake=4*512," as reported by dmesg:
+
+ Faking node 0 at 0000000000000000-0000000020000000 (512MB)
+ Faking node 1 at 0000000020000000-0000000040000000 (512MB)
+ Faking node 2 at 0000000040000000-0000000060000000 (512MB)
+ Faking node 3 at 0000000060000000-0000000080000000 (512MB)
+ ...
+ On node 0 totalpages: 130975
+ On node 1 totalpages: 131072
+ On node 2 totalpages: 131072
+ On node 3 totalpages: 131072
+
+Now following the instructions for mounting the cpusets filesystem from
+Documentation/cpusets.txt, you can assign fake nodes (i.e. contiguous memory
+address spaces) to individual cpusets:
+
+ [root@...ads /]# mkdir exampleset
+ [root@...ads /]# mount -t cpuset none exampleset
+ [root@...ads /]# mkdir exampleset/ddset
+ [root@...ads /]# cd exampleset/ddset
+ [root@...ads /exampleset/ddset]# echo 0-1 > cpus
+ [root@...ads /exampleset/ddset]# echo 0-1 > mems
+
+Now this cpuset, 'ddset', will only allowed access to fake nodes 0 and 1 for
+memory allocations (1G).
+
+You can now assign tasks to these cpusets to limit the memory resources
+available to them according to the fake nodes assigned as mems:
+
+ [root@...ads /exampleset/ddset]# echo $$ > tasks
+ [root@...ads /exampleset/ddset]# dd if=/dev/zero of=tmp bs=1024 count=1G
+ [1] 13425
+
+Notice the difference between the system memory usage as reported by
+/proc/meminfo between the restricted cpuset case above and the unrestricted
+case (i.e. running the same 'dd' command without assigning it to a fake NUMA
+cpuset):
+ Unrestricted Restricted
+ MemTotal: 3091900 kB 3091900 kB
+ MemFree: 42113 kB 1513236 kB
+
+This allows for coarse memory management for the tasks you assign to particular
+cpusets. Since cpusets can form a hierarchy, you can create some pretty
+interesting combinations of use-cases for various classes of tasks for your
+memory management needs.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists