[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20081203141534.39d1fc28.kamezawa.hiroyu@jp.fujitsu.com>
Date: Wed, 3 Dec 2008 14:15:34 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc: "linux-mm@...ck.org" <linux-mm@...ck.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"balbir@...ux.vnet.ibm.com" <balbir@...ux.vnet.ibm.com>,
"nishimura@....nes.nec.co.jp" <nishimura@....nes.nec.co.jp>,
"kosaki.motohiro@...fujitsu.com" <kosaki.motohiro@...fujitsu.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: [PATCH 22/21] memcg-explain-details-and-test-document.patch
just passed spell check. sorry for 22/21.
==
Documentation for implementation details and how to test.
just an example. feel free to modify, add, remove lines.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Documentation/controllers/memcg_test.txt | 145 +++++++++++++++++++++++++++++++
1 file changed, 145 insertions(+)
Index: mmotm-2.6.28-Dec02/Documentation/controllers/memcg_test.txt
===================================================================
--- /dev/null
+++ mmotm-2.6.28-Dec02/Documentation/controllers/memcg_test.txt
@@ -0,0 +1,145 @@
+Memory Resource Controller(Memcg) Implementation Memo.
+Last Updated: 2009/12/03
+
+Because VM is getting complex (one of reasons is memcg...), memcg's behavior
+is complex. This is a document for memcg's internal behavior and some test
+patterns tend to be racy.
+
+1. charges
+
+ a page/swp_entry may be charged (usage += PAGE_SIZE) at
+
+ mem_cgroup_newpage_newpage()
+ called at new page fault and COW.
+
+ mem_cgroup_try_charge_swapin()
+ called at do_swap_page() and swapoff.
+ followed by charge-commit-cancel protocol.
+ (With swap accounting) at commit, charges recorded in swap is removed.
+
+ mem_cgroup_cache_charge()
+ called at add_to_page_cache()
+
+ mem_cgroup_cache_charge_swapin)()
+ called by shmem's swapin processing.
+
+ mem_cgroup_prepare_migration()
+ called before migration. "extra" charge is done
+ followed by charge-commit-cancel protocol.
+ At commit, charge against oldpage or newpage will be committed.
+
+2. uncharge
+ a page/swp_entry may be uncharged (usage -= PAGE_SIZE) by
+
+ mem_cgroup_uncharge_page()
+ called when an anonymous page is unmapped. If the page is SwapCache
+ uncharge is delayed until mem_cgroup_uncharge_swapcache().
+
+ mem_cgroup_uncharge_cache_page()
+ called when a page-cache is deleted from radix-tree. If the page is
+ SwapCache, uncharge is delayed until mem_cgroup_uncharge_swapcache()
+
+ mem_cgroup_uncharge_swapcache()
+ called when SwapCache is removed from radix-tree. the charge itself
+ is moved to swap_cgroup. (If mem+swap controller is disabled, no
+ charge to swap.)
+
+ mem_cgroup_uncharge_swap()
+ called when swp_entry's refcnt goes down to be 0. charge against swap
+ disappears.
+
+ mem_cgroup_end_migration(old, new)
+ at success of migration -> old is uncharged (if necessary), charge
+ to new is committed. at failure, charge to old is committed.
+
+3. charge-commit-cancel
+ In some case, we can't know this "charge" is valid or not at charge.
+ To handle such case, there are charge-commit-cancel functions.
+ mem_cgroup_try_charge_XXX
+ mem_cgroup_commit_charge_XXX
+ mem_cgroup_cancel_charge_XXX
+ these are used in swap-in and migration.
+
+ At try_charge(), there are no flags to say "this page is charged".
+ at this point, usage += PAGE_SIZE.
+
+ At commit(), the function checks the page should be charged or not
+ and set flags or avoid charging.(usage -= PAGE_SIZE)
+
+ At cancel(), simply usage -= PAGE_SIZE.
+
+4. Typical Tests.
+
+ Tests for racy cases.
+
+ 4.1 small limit to memcg.
+ When you do test to do racy case, it's good test to set memcg's limit
+ to be very small rather than GB. Many races found in the test under
+ xKB or xxMB limits.
+ (Memory behavior under GB and Memory behavior under MB shows very
+ different situation.)
+
+ 4.2 shmem
+ Historically, memcg's shmem handling was poor and we saw some amount
+ of troubles here. This is because shmem is page-cache but can be
+ SwapCache. Test with shmem/tmpfs is always good test.
+
+ 4.3 migration
+ For NUMA, migration is an another special. To do easy test, cpuset
+ is useful. Following is a sample script to do migration.
+
+ mount -t cgroup -o cpuset none /opt/cpuset
+
+ mkdir /opt/cpuset/01
+ echo 1 > /opt/cpuset/01/cpuset.cpus
+ echo 0 > /opt/cpuset/01/cpuset.mems
+ echo 1 > /opt/cpuset/01/cpuset.memory_migrate
+ mkdir /opt/cpuset/02
+ echo 1 > /opt/cpuset/02/cpuset.cpus
+ echo 1 > /opt/cpuset/02/cpuset.mems
+ echo 1 > /opt/cpuset/02/cpuset.memory_migrate
+
+ In above set, when you moves a task from 01 to 02, page migration to
+ node 0 to node 1 will occur. Following is a script to migrate all
+ under cpuset.
+ --
+ move_task()
+ {
+ for pid in $1
+ do
+ /bin/echo $pid >$2/tasks 2>/dev/null
+ echo -n $pid
+ echo -n " "
+ done
+ echo END
+ }
+
+ G1_TASK=`cat ${G1}/tasks`
+ G2_TASK=`cat ${G2}/tasks`
+ move_task "${G1_TASK}" ${G2} &
+ --
+ 4.4 memory hotplug.
+ memory hotplug test is one of good test.
+ to offline memory, do following.
+ # echo offline > /sys/devices/system/memory/memoryXXX/state
+ (XXX is the place of memory)
+ This is an easy way to test page migration, too.
+
+ 4.5 mkdir/rmdir
+ When using hierarchy, mkdir/rmdir test should be done.
+ tests like following.
+
+ #echo 1 >/opt/cgroup/01/memory/use_hierarchy
+ #mkdir /opt/cgroup/01/child_a
+ #mkdir /opt/cgroup/01/child_b
+
+ set limit to 01.
+ add limit to 01/child_b
+ run jobs under child_a and child_b
+
+ create/delete following groups at random while jobs are running.
+ /opt/cgroup/01/child_a/child_aa
+ /opt/cgroup/01/child_b/child_bb
+ /opt/cgroup/01/child_c
+
+ running new jobs in new group is also good.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists