lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 23 Nov 2012 14:47:11 +0800
From:	Zhouping Liu <zliu@...hat.com>
To:	Ingo Molnar <mingo@...nel.org>
CC:	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Paul Turner <pjt@...gle.com>,
	Lee Schermerhorn <Lee.Schermerhorn@...com>,
	Christoph Lameter <cl@...ux.com>,
	Rik van Riel <riel@...hat.com>, Mel Gorman <mgorman@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Johannes Weiner <hannes@...xchg.org>,
	Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH 00/33] Latest numa/core release, v17

On 11/23/2012 06:53 AM, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@...nel.org> wrote:
>
>> This release mainly addresses one of the regressions Linus
>> (rightfully) complained about: the "4x JVM" SPECjbb run.
>>
>> [ Note to testers: if possible please still run with
>>    CONFIG_TRANSPARENT_HUGEPAGES=y enabled, to avoid the
>>    !THP regression that is still not fully fixed.
>>    It will be fixed next. ]
> I forgot to include the Git link:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master

Hi Ingo, I tested the latest tip/master tree on 2 nodes with 8 
processors, closed CONFIG_TRANSPARENT_HUGEPAGE,
and found some issues:

one is that command `stress -i 20 -m 30 -v` caused some hung tasks:

------------- snip ---------------------------
[ 1726.278382] Node 0 DMA free:15880kB min:20kB low:24kB high:28kB 
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15332kB 
mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB 
slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB 
pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB 
pages_scanned:0 all_unreclaimable? yes
[ 1726.366825] lowmem_reserve[]: 0 1957 3973 3973
[ 1726.388610] Node 0 DMA32 free:10856kB min:2796kB low:3492kB 
high:4192kB active_anon:1479384kB inactive_anon:498788kB active_file:0kB 
inactive_file:8kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
present:2004184kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB 
shmem:20kB slab_reclaimable:140kB slab_unreclaimable:96kB 
kernel_stack:8kB pagetables:80kB unstable:0kB bounce:0kB free_cma:0kB 
writeback_tmp:0kB pages_scanned:3502066 all_unreclaimable? yes
[ 1726.490163] lowmem_reserve[]: 0 0 2016 2016
[ 1726.515235] Node 0 Normal free:2880kB min:2880kB low:3600kB 
high:4320kB active_anon:1453776kB inactive_anon:490196kB 
active_file:748kB inactive_file:1140kB unevictable:3740kB 
isolated(anon):0kB isolated(file):0kB present:2064384kB mlocked:3492kB 
dirty:0kB writeback:0kB mapped:2748kB shmem:2116kB 
slab_reclaimable:9260kB slab_unreclaimable:35880kB kernel_stack:1184kB 
pagetables:3308kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB 
pages_scanned:3437106 all_unreclaimable? yes
[ 1726.629591] lowmem_reserve[]: 0 0 0 0
[ 1726.657650] Node 1 Normal free:5748kB min:5760kB low:7200kB 
high:8640kB active_anon:3383776kB inactive_anon:682376kB active_file:8kB 
inactive_file:340kB unevictable:8kB isolated(anon):384kB 
isolated(file):0kB present:4128768kB mlocked:8kB dirty:0kB writeback:0kB 
mapped:20kB shmem:12kB slab_reclaimable:9364kB 
slab_unreclaimable:13728kB kernel_stack:880kB pagetables:12492kB 
unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB 
pages_scanned:8696274 all_unreclaimable? yes
[ 1726.782732] lowmem_reserve[]: 0 0 0 0
[ 1726.814748] Node 0 DMA: 2*4kB 2*8kB 1*16kB 1*32kB 3*64kB 2*128kB 
0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB
[ 1726.854951] Node 0 DMA32: 12*4kB 25*8kB 21*16kB 19*32kB 15*64kB 
8*128kB 4*256kB 1*512kB 0*1024kB 1*2048kB 1*4096kB = 10856kB
[ 1726.896378] Node 0 Normal: 556*4kB 11*8kB 6*16kB 7*32kB 2*64kB 
1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2888kB
[ 1726.937928] Node 1 Normal: 392*4kB 22*8kB 1*16kB 0*32kB 0*64kB 
0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 5856kB
[ 1726.979519] 481162 total pagecache pages
[ 1727.013687] 479540 pages in swap cache
[ 1727.047898] Swap cache stats: add 2176770, delete 1697230, find 
701489/781867
[ 1727.085709] Free swap  = 4371040kB
[ 1727.119839] Total swap = 8175612kB
[ 1727.187872] 2097136 pages RAM
[ 1727.221789] 56226 pages reserved
[ 1727.256273] 1721904 pages shared
[ 1727.290826] 1549555 pages non-shared
[ 1727.325708] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents 
oom_score_adj name
[ 1727.366026] [  303]     0   303     9750      184      23 500         
-1000 systemd-udevd
[ 1727.407347] [  448]     0   448     6558      178      13 56         
-1000 auditd
[ 1727.447970] [  516]     0   516    61311      172      22 
105             0 rsyslogd
[ 1727.488827] [  519]     0   519    24696       35      15 
35             0 iscsiuio
[ 1727.529839] [  529]    81   529     8201      246      27 93          
-900 dbus-daemon
[ 1727.571553] [  530]     0   530     1264       93       9 
16             0 iscsid
[ 1727.612795] [  531]     0   531     1389      843       9 0           
-17 iscsid
[ 1727.654069] [  538]     0   538     1608      157      10 
29             0 mcelog
[ 1727.695659] [  543]     0   543     5906      138      16 
49             0 atd
[ 1727.736954] [  551]     0   551    30146      211      22 
123             0 crond
[ 1727.778589] [  569]     0   569    26612      223      83 
185             0 login
[ 1727.820932] [  587]     0   587    86085      334      84 
139             0 NetworkManager
[ 1727.863388] [  643]     0   643    21931      200      65 
171          -900 modem-manager
[ 1727.906015] [  644]     0   644     5812      183      16 
74             0 bluetoothd
[ 1727.948629] [  672]     0   672   118364      228     179 
1079             0 libvirtd
[ 1727.991230] [  691]     0   691    20059      189      40 198         
-1000 sshd
[ 1728.033700] [  996]     0   996    28950      196      17 
516             0 bash
[ 1728.076541] [ 1056]     0  1056     1803      112      10 
22             0 stress
[ 1728.119495] [ 1057]     0  1057     1803       27      10 
21             0 stress
[ 1728.162616] [ 1058]     0  1058    67340    54961     118 
12             0 stress
[ 1728.205521] [ 1059]     0  1059     1803       27      10 
21             0 stress
[ 1728.248414] [ 1060]     0  1060    67340    12209      35 
12             0 stress
[ 1728.291529] [ 1061]     0  1061     1803       27      10 
21             0 stress
[ 1728.335147] [ 1062]     0  1062    67340    23519      57 
12             0 stress
[ 1728.378537] [ 1063]     0  1063     1803       27      10 
21             0 stress
[ 1728.421877] [ 1064]     0  1064    67340     7673     138 
57944             0 stress
[ 1728.465325] [ 1065]     0  1065     1803       27      10 
21             0 stress
[ 1728.508955] [ 1066]     0  1066    67340       90      11 
12             0 stress
[ 1728.553187] [ 1067]     0  1067     1803       27      10 
21             0 stress
[ 1728.597554] [ 1068]     0  1068    67340    58628     126 
12             0 stress
[ 1728.640668] [ 1069]     0  1069     1803       27      10 
21             0 stress
[ 1728.683676] [ 1070]     0  1070    67340    59802     128 
12             0 stress
[ 1728.726534] [ 1071]     0  1071     1803       27      10 
21             0 stress
[ 1728.769082] [ 1072]     0  1072    67340     5924     138 
59693             0 stress
[ 1728.811455] [ 1073]     0  1073     1803       27      10 
21             0 stress
[ 1728.852798] [ 1074]     0  1074    67340    65103     138 
14             0 stress
[ 1728.892605] [ 1075]     0  1075     1803       27      10 
21             0 stress
[ 1728.931191] [ 1076]     0  1076    67340    60077     128 
13             0 stress
[ 1728.969491] [ 1077]     0  1077     1803       27      10 
21             0 stress
[ 1729.006394] [ 1078]     0  1078    67340    13262     138 
52355             0 stress
[ 1729.042189] [ 1079]     0  1079     1803       27      10 
21             0 stress
[ 1729.076890] [ 1080]     0  1080    67340    38640      87 
12             0 stress
[ 1729.111443] [ 1081]     0  1081     1803       27      10 
21             0 stress
[ 1729.144638] [ 1082]     0  1082    67340     8238     138 
57379             0 stress
[ 1729.176403] [ 1083]     0  1083     1803       27      10 
21             0 stress
[ 1729.206905] [ 1084]     0  1084    67340    55392     119 
12             0 stress
[ 1729.237086] [ 1085]     0  1085     1803       27      10 
21             0 stress
[ 1729.265883] [ 1086]     0  1086    67340     4169     138 
61447             0 stress
[ 1729.293362] [ 1087]     0  1087     1803       27      10 
21             0 stress
[ 1729.319405] [ 1088]     0  1088    67340    16042      42 
12             0 stress
[ 1729.345380] [ 1089]     0  1089     1803       27      10 
21             0 stress
[ 1729.370934] [ 1090]     0  1090    67340     1223      13 
12             0 stress
[ 1729.395553] [ 1091]     0  1091     1803       27      10 
21             0 stress
[ 1729.419544] [ 1092]     0  1092    67340     8318     138 
57298             0 stress
[ 1729.443863] [ 1093]     0  1093     1803       27      10 
21             0 stress
[ 1729.467471] [ 1094]     0  1094    67340     2342      16 
12             0 stress
[ 1729.491074] [ 1095]     0  1095     1803       27      10 
21             0 stress
[ 1729.514194] [ 1096]     0  1096    67340    59017     126 
12             0 stress
[ 1729.536998] [ 1097]     0  1097    67340    36245      82 
12             0 stress
[ 1729.559710] [ 1098]     0  1098    67340    57050     122 
12             0 stress
[ 1729.582264] [ 1099]     0  1099    67340    29239      68 
12             0 stress
[ 1729.604895] [ 1100]     0  1100    67340    30815      71 
12             0 stress
[ 1729.627532] [ 1101]     0  1101    67340     6881     138 
58735             0 stress
[ 1729.650016] [ 1102]     0  1102    67340    37447      84 
12             0 stress
[ 1729.672130] [ 1103]     0  1103    67340     6891      24 
12             0 stress
[ 1729.693897] [ 1104]     0  1104    67340    35463      80 
12             0 stress
[ 1729.715565] [ 1105]     0  1105    67340    11843      34 
12             0 stress
[ 1729.736992] [ 1106]     0  1106    67340    10279     138 
55338             0 stress
[ 1729.758383] [ 1198]     0  1198    88549     5957     185 
6739          -900 setroubleshootd
[ 1729.780776] [ 2309]     0  2309     3243      192      12 
0             0 systemd-cgroups
[ 1729.803176] [ 2312]     0  2312     3243      179      12 
0             0 systemd-cgroups
[ 1729.825560] [ 2314]     0  2314     3243      209      11 
0             0 systemd-cgroups
[ 1729.847848] [ 2315]     0  2315     3243      165      11 
0             0 systemd-cgroups
[ 1729.870223] [ 2317]     0  2317     1736       41       6 
0             0 systemd-cgroups
[ 1729.892316] [ 2319]     0  2319     2688       46       7 
0             0 systemd-cgroups
[ 1729.914310] [ 2320]     0  2320      681       34       4 
0             0 systemd-cgroups
[ 1729.936223] [ 2321]     0  2321       42        1       2 
0             0 systemd-cgroups
[ 1729.957811] Out of memory: Kill process 516 (rsyslogd) score 0 or 
sacrifice child
[ 1729.978407] Killed process 516 (rsyslogd) total-vm:245244kB, 
anon-rss:0kB, file-rss:688kB
[ 1923.469572] INFO: task kworker/4:2:232 blocked for more than 120 seconds.
[ 1923.490002] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[ 1923.511856] kworker/4:2     D ffff88027fc14080     0   232      2 
0x00000000
[ 1923.533216]  ffff88027977db88 0000000000000046 ffff8802797b5040 
ffff88027977dfd8
[ 1923.556100]  ffff88027977dfd8 ffff88027977dfd8 ffff880179063580 
ffff8802797b5040
[ 1923.578569]  ffff88027977db68 ffff88027977dd10 7fffffffffffffff 
ffff8802797b5040
[ 1923.601034] Call Trace:
[ 1923.618253]  [<ffffffff815e7b69>] schedule+0x29/0x70
[ 1923.638372]  [<ffffffff815e6114>] schedule_timeout+0x1f4/0x2b0
[ 1923.659539]  [<ffffffff815e79f0>] wait_for_common+0x120/0x170
[ 1923.680664]  [<ffffffff81096920>] ? try_to_wake_up+0x2d0/0x2d0
[ 1923.702224]  [<ffffffff815e7b3d>] wait_for_completion+0x1d/0x20
[ 1923.723820]  [<ffffffff8107b0b9>] call_usermodehelper_fns+0x1d9/0x200
[ 1923.746167]  [<ffffffff810d0b32>] cgroup_release_agent+0xe2/0x180
[ 1923.768233]  [<ffffffff8107e638>] process_one_work+0x148/0x490
[ 1923.790179]  [<ffffffff810d0a50>] ? init_root_id+0xb0/0xb0
[ 1923.811797]  [<ffffffff8107f16e>] worker_thread+0x15e/0x450
[ 1923.833733]  [<ffffffff8107f010>] ? busy_worker_rebind_fn+0x110/0x110
[ 1923.856703]  [<ffffffff81084350>] kthread+0xc0/0xd0
[ 1923.878067]  [<ffffffff81084290>] ? kthread_create_on_node+0x120/0x120
[ 1923.901388]  [<ffffffff815f12ac>] ret_from_fork+0x7c/0xb0
[ 1923.923544]  [<ffffffff81084290>] ? kthread_create_on_node+0x120/0x120
[ 1923.947290] INFO: task rs:main Q:Reg:534 blocked for more than 120 
seconds.
[ 1923.971646] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[ 1923.997338] rs:main Q:Reg   D ffff88027fc54080     0   534      1 
0x00000084
[ 1924.022565]  ffff880279913930 0000000000000082 ffff880279b29ac0 
ffff880279913fd8
[ 1924.049053]  ffff880279913fd8 ffff880279913fd8 ffff880279b2b580 
ffff880279b29ac0
[ 1924.075230]  000000000003b55f ffff880279b29ac0 ffff880279bedde0 
ffffffffffffffff
[ 1924.101476] Call Trace:
[ 1924.122766]  [<ffffffff815e7b69>] schedule+0x29/0x70
[ 1924.146894]  [<ffffffff815e8915>] rwsem_down_failed_common+0xb5/0x140
[ 1924.173018]  [<ffffffff815e89d5>] rwsem_down_read_failed+0x15/0x17
[ 1924.198572]  [<ffffffff812eb0c4>] call_rwsem_down_read_failed+0x14/0x30
[ 1924.224975]  [<ffffffff810fecf3>] ? taskstats_exit+0x383/0x420
[ 1924.250579]  [<ffffffff812e9e5f>] ? __get_user_8+0x1f/0x29
[ 1924.275797]  [<ffffffff815e6df4>] ? down_read+0x24/0x2b
[ 1924.300955]  [<ffffffff815eca16>] __do_page_fault+0x1c6/0x4e0
[ 1924.326726]  [<ffffffff811797ef>] ? alloc_pages_current+0xcf/0x140
[ 1924.353121]  [<ffffffff8118345e>] ? new_slab+0x20e/0x310
[ 1924.378729]  [<ffffffff815ecd3e>] do_page_fault+0xe/0x10
[ 1924.404424]  [<ffffffff815e9358>] page_fault+0x28/0x30
[ 1924.429991]  [<ffffffff810fecf3>] ? taskstats_exit+0x383/0x420
[ 1924.456539]  [<ffffffff812e9e5f>] ? __get_user_8+0x1f/0x29
[ 1924.482746]  [<ffffffff810c047d>] ? exit_robust_list+0x5d/0x160
[ 1924.509604]  [<ffffffff810feca9>] ? taskstats_exit+0x339/0x420
[ 1924.536514]  [<ffffffff8105e5d7>] mm_release+0x147/0x160
[ 1924.563242]  [<ffffffff81065186>] exit_mm+0x26/0x120
[ 1924.589384]  [<ffffffff81066787>] do_exit+0x167/0x8d0
[ 1924.615888]  [<ffffffff810be75b>] ? futex_wait+0x13b/0x2c0
[ 1924.642809]  [<ffffffff81183060>] ? kmem_cache_free+0x20/0x160
[ 1924.670213]  [<ffffffff8106733f>] do_group_exit+0x3f/0xa0
[ 1924.697387]  [<ffffffff81075eca>] get_signal_to_deliver+0x1ca/0x5e0
[ 1924.726012]  [<ffffffff8101437f>] do_signal+0x3f/0x610
[ 1924.753260]  [<ffffffff810c06ad>] ? do_futex+0x12d/0x580
[ 1924.780688]  [<ffffffff810149f0>] do_notify_resume+0x80/0xb0
[ 1924.808394]  [<ffffffff815f1612>] int_signal+0x12/0x17
[ 1924.835765] INFO: task rsyslogd:536 blocked for more than 120 seconds.
[ 1924.864967] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[ 1924.896012] rsyslogd        D ffff88017bc14080     0   536      1 
0x00000086
[ 1924.926767]  ffff8802791b7c38 0000000000000082 ffff88027a0a0000 
ffff8802791b7fd8
[ 1924.958361]  ffff8802791b7fd8 ffff8802791b7fd8 ffff880279b29ac0 
ffff88027a0a0000
[ 1924.990164]  ffff8802791b7c28 ffff88027a0a0000 ffff880279bedde0 
ffffffffffffffff
[ 1925.021929] Call Trace:
[ 1925.048178]  [<ffffffff815e7b69>] schedule+0x29/0x70
[ 1925.077075]  [<ffffffff815e8915>] rwsem_down_failed_common+0xb5/0x140
[ 1925.107125]  [<ffffffff815e89b3>] rwsem_down_write_failed+0x13/0x20
[ 1925.136665]  [<ffffffff812eb0f3>] call_rwsem_down_write_failed+0x13/0x20
[ 1925.166430]  [<ffffffff815e6dc2>] ? down_write+0x32/0x40
[ 1925.194797]  [<ffffffff8109ae0b>] task_numa_work+0xeb/0x270
[ 1925.223254]  [<ffffffff81081047>] task_work_run+0xa7/0xe0
[ 1925.251609]  [<ffffffff810762bb>] get_signal_to_deliver+0x5bb/0x5e0
[ 1925.280624]  [<ffffffff8115e619>] ? handle_mm_fault+0x149/0x210
[ 1925.309222]  [<ffffffff8101437f>] do_signal+0x3f/0x610
[ 1925.337048]  [<ffffffff8117cea1>] ? change_prot_numa+0x51/0x60
[ 1925.365604]  [<ffffffff8109aef6>] ? task_numa_work+0x1d6/0x270
[ 1925.394295]  [<ffffffff810149f0>] do_notify_resume+0x80/0xb0
[ 1925.422677]  [<ffffffff815e917c>] retint_signal+0x48/0x8c
[ 1925.450753] INFO: task NetworkManager:587 blocked for more than 120 
seconds.
[ 1925.480882] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[ 1925.512086] NetworkManager  D ffff88027fc54080     0   587      1 
0x00000080
[ 1925.542751]  ffff880279fb7dc8 0000000000000086 ffff880279f29ac0 
ffff880279fb7fd8
[ 1925.574086]  ffff880279fb7fd8 ffff880279fb7fd8 ffff8801752c1ac0 
ffff880279f29ac0
[ 1925.605328]  ffffea0005e82c40 ffff880279f29ac0 ffff8801793cd860 
ffffffffffffffff
[ 1925.636585] Call Trace:
[ 1925.662557]  [<ffffffff815e7b69>] schedule+0x29/0x70
[ 1925.691518]  [<ffffffff815e8915>] rwsem_down_failed_common+0xb5/0x140
[ 1925.721815]  [<ffffffff8115e619>] ? handle_mm_fault+0x149/0x210
[ 1925.751814]  [<ffffffff815e89b3>] rwsem_down_write_failed+0x13/0x20
[ 1925.781475]  [<ffffffff812eb0f3>] call_rwsem_down_write_failed+0x13/0x20
[ 1925.811421]  [<ffffffff815e6dc2>] ? down_write+0x32/0x40
[ 1925.839679]  [<ffffffff8109ae0b>] task_numa_work+0xeb/0x270
[ 1925.868180]  [<ffffffff810e307c>] ? __audit_syscall_exit+0x3ec/0x450
[ 1925.897544]  [<ffffffff81081047>] task_work_run+0xa7/0xe0
[ 1925.925674]  [<ffffffff810149e1>] do_notify_resume+0x71/0xb0
[ 1925.954159]  [<ffffffff815e917c>] retint_signal+0x48/0x8c
[ 2045.984126] INFO: task kworker/4:2:232 blocked for more than 120 seconds.
[ 2046.013951] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
[ 2046.045339] kworker/4:2     D ffff88027fc14080     0   232      2 
0x00000000
[ 2046.075873]  ffff88027977db88 0000000000000046 ffff8802797b5040 
ffff88027977dfd8
[ 2046.107357]  ffff88027977dfd8 ffff88027977dfd8 ffff880179063580 
ffff8802797b5040
[ 2046.138576]  ffff88027977db68 ffff88027977dd10 7fffffffffffffff 
ffff8802797b5040
[ 2046.169696] Call Trace:
[ 2046.195647]  [<ffffffff815e7b69>] schedule+0x29/0x70
[ 2046.224276]  [<ffffffff815e6114>] schedule_timeout+0x1f4/0x2b0
[ 2046.253957]  [<ffffffff815e79f0>] wait_for_common+0x120/0x170
[ 2046.283497]  [<ffffffff81096920>] ? try_to_wake_up+0x2d0/0x2d0
[ 2046.313349]  [<ffffffff815e7b3d>] wait_for_completion+0x1d/0x20
[ 2046.343061]  [<ffffffff8107b0b9>] call_usermodehelper_fns+0x1d9/0x200
[ 2046.373636]  [<ffffffff810d0b32>] cgroup_release_agent+0xe2/0x180
[ 2046.403705]  [<ffffffff8107e638>] process_one_work+0x148/0x490
[ 2046.433485]  [<ffffffff810d0a50>] ? init_root_id+0xb0/0xb0
[ 2046.462879]  [<ffffffff8107f16e>] worker_thread+0x15e/0x450
[ 2046.492398]  [<ffffffff8107f010>] ? busy_worker_rebind_fn+0x110/0x110
[ 2046.522898]  [<ffffffff81084350>] kthread+0xc0/0xd0
[ 2046.551810]  [<ffffffff81084290>] ? kthread_create_on_node+0x120/0x120
[ 2046.582425]  [<ffffffff815f12ac>] ret_from_fork+0x7c/0xb0
[ 2046.612099]  [<ffffffff81084290>] ? kthread_create_on_node+0x120/0x120
----------------------- snip ----------------------------

the other is that oom02(LTP: testcases/kernel/mem/oom/oom02) made 
oom-killer kill unexpected processes:
(oom02 is designed to try to hog memory on one node), and oom02 always 
hung until you kill it manually.

------------ snip -----------------------
[12449.554508] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents 
oom_score_adj name
[12449.554524] [  303]     0   303     9750      296      23 414         
-1000 systemd-udevd
[12449.554527] [  448]     0   448     6558      224      13 35         
-1000 auditd
[12449.554532] [  519]     0   519    24696       82      15 
20             0 iscsiuio
[12449.554534] [  529]    81   529     8201      389      27 35          
-900 dbus-daemon
[12449.554536] [  530]     0   530     1264       94       9 
15             0 iscsid
[12449.554538] [  531]     0   531     1389      876       9 0           
-17 iscsid
[12449.554540] [  538]     0   538     1608      158      10 
28             0 mcelog
[12449.554542] [  543]     0   543     5906      138      16 
48             0 atd
[12449.554543] [  551]     0   551    30146      301      22 
56             0 crond
[12449.554545] [  569]     0   569    26612      224      83 
184             0 login
[12449.554547] [  587]     0   587   108127      917      91 
45             0 NetworkManager
[12449.554549] [  643]     0   643    21931      290      65 
120          -900 modem-manager
[12449.554551] [  644]     0   644     5812      222      16 
61             0 bluetoothd
[12449.554553] [  672]     0   672   118388      469     179 
935             0 libvirtd
[12449.554555] [  691]     0   691    20059      229      40 177         
-1000 sshd
[12449.554556] [  996]     0   996    28975      654      18 
186             0 bash
[12449.554558] [ 1198]     0  1198    88549     5958     185 
6738          -900 setroubleshootd
[12449.554563] [ 2332]     0  2332    74107     3729     152 
0             0 systemd-journal
[12449.554564] [ 2335]     0  2335     8222      388      20 
0             0 systemd-logind
[12449.554567] [20909]     0 20909    61311      456      22 
0             0 rsyslogd
[12449.554572] [12818]     0 12818   788031     6273      24 
0             0 oom02
[12449.554574] [13193]     0 13193    23143     3749      46 
0             0 dhclient
[12449.554576] [13217]     0 13217    33221     1188      67 
0             0 sshd
[12449.554577] [13221]     0 13221    28949      879      19 
0             0 bash
[12449.554579] [13308]     0 13308    23143     3749      45 
0             0 dhclient
[12449.554581] [13388]     0 13388    36864     1063      45 
0             0 vim
[12449.554583] Out of memory: Kill process 13193 (dhclient) score 1 or 
sacrifice child
[12449.554584] Killed process 13193 (dhclient) total-vm:92572kB, 
anon-rss:11976kB, file-rss:3020kB
[12451.878812] oom02 invoked oom-killer: gfp_mask=0x280da, order=0, 
oom_score_adj=0
[12451.878815] oom02 cpuset=/ mems_allowed=0-1
[12451.878818] Pid: 12818, comm: oom02 Tainted: G        W 
3.7.0-rc6numacorev17+ #5
[12451.878819] Call Trace:
[12451.878829]  [<ffffffff810d92d1>] ? 
cpuset_print_task_mems_allowed+0x91/0xa0
[12451.878834]  [<ffffffff815dd3b6>] dump_header.isra.12+0x70/0x19b
[12451.878838]  [<ffffffff812e37d3>] ? ___ratelimit+0xa3/0x120
[12451.878843]  [<ffffffff81139c0d>] oom_kill_process+0x1cd/0x320
[12451.878848]  [<ffffffff8106d3c5>] ? has_ns_capability_noaudit+0x15/0x20
[12451.878850]  [<ffffffff8113a377>] out_of_memory+0x447/0x480
[12451.878853]  [<ffffffff8113ff5c>] __alloc_pages_nodemask+0x94c/0x960
[12451.878858]  [<ffffffff8117b1a6>] alloc_pages_vma+0xb6/0x190
[12451.878861]  [<ffffffff8115e094>] handle_pte_fault+0x8f4/0xb90
[12451.878865]  [<ffffffff810fa237>] ? call_rcu_sched+0x17/0x20
[12451.878868]  [<ffffffff8118ed82>] ? put_filp+0x52/0x60
[12451.878870]  [<ffffffff8115e619>] handle_mm_fault+0x149/0x210
[12451.878873]  [<ffffffff815ec9c2>] __do_page_fault+0x172/0x4e0
[12451.878875]  [<ffffffff81183060>] ? kmem_cache_free+0x20/0x160
[12451.878878]  [<ffffffff81198396>] ? final_putname+0x26/0x50
[12451.878880]  [<ffffffff815ecd3e>] do_page_fault+0xe/0x10
[12451.878883]  [<ffffffff815e9358>] page_fault+0x28/0x30
--------------- snip ------------------

-------------- snip ------------------
[12451.956997] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents 
oom_score_adj name
[12451.957010] [  303]     0   303     9750      296      23 414         
-1000 systemd-udevd
[12451.957013] [  448]     0   448     6558      224      13 35         
-1000 auditd
[12451.957017] [  519]     0   519    24696       82      15 
20             0 iscsiuio
[12451.957019] [  529]    81   529     8201      389      27 35          
-900 dbus-daemon
[12451.957021] [  530]     0   530     1264       94       9 
15             0 iscsid
[12451.957022] [  531]     0   531     1389      876       9 0           
-17 iscsid
[12451.957024] [  538]     0   538     1608      158      10 
28             0 mcelog
[12451.957026] [  543]     0   543     5906      138      16 
48             0 atd
[12451.957028] [  551]     0   551    30146      301      22 
56             0 crond
[12451.957029] [  569]     0   569    26612      224      83 
184             0 login
[12451.957031] [  587]     0   587   110177      922      92 
45             0 NetworkManager
[12451.957033] [  643]     0   643    21931      290      65 
120          -900 modem-manager
[12451.957035] [  644]     0   644     5812      222      16 
61             0 bluetoothd
[12451.957036] [  672]     0   672   118388      469     179 
935             0 libvirtd
[12451.957038] [  691]     0   691    20059      229      40 177         
-1000 sshd
[12451.957040] [  996]     0   996    28975      654      18 
186             0 bash
[12451.957042] [ 1198]     0  1198    88549     5958     185 
6738          -900 setroubleshootd
[12451.957045] [ 2332]     0  2332    74107     3897     152 
0             0 systemd-journal
[12451.957047] [ 2335]     0  2335     8222      388      20 
0             0 systemd-logind
[12451.957049] [20909]     0 20909    61311      456      22 
0             0 rsyslogd
[12451.957052] [12818]     0 12818   788031     6273      24 
0             0 oom02
[12451.957054] [13217]     0 13217    33221     1188      67 
0             0 sshd
[12451.957055] [13221]     0 13221    28949      879      19 
0             0 bash
[12451.957057] [13388]     0 13388    36864     1063      45 
0             0 vim
[12451.957059] [13410]     0 13410    33476      510      58 0          
-900 nm-dispatcher.a
[12451.957061] Out of memory: Kill process 2335 (systemd-logind) score 0 
or sacrifice child
-------------------- snip ----------------------------

also I found oom-killer performed bad on numa/core tree, you can use 
LTP: testcases/kernel/mem/oom/oom* to verify it.

please let me know if you need more details or any other testing works.

Thanks,
Zhouping
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ