Date:	Thu, 08 Aug 2013 00:08:56 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	LKML <linux-kernel@...r.kernel.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Christoph Lameter <cl@...ux.com>,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
	Pekka Enberg <penberg@...nel.org>
Subject: [BUG] hackbench locks up with perf in 3.11-rc1 and beyond

I went to do some benchmarks on the jump label code, and ran:


perf stat -r 100 ./hackbench 50

It completed two runs, then died with:

[   65.785108] hackbench invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
[   65.792921] hackbench cpuset=/ mems_allowed=0
[   65.797286] CPU: 6 PID: 6042 Comm: hackbench Not tainted 3.11.0-rc4-test+ #26
[   65.804428] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
[   65.813392]  0000000000000000 ffff8800105f5478 ffffffff8162024f 000000000000001e
[   65.820876]  ffff8800105f9770 ffff8800105f54f8 ffffffff8161ca6e 0000000000000000
[   65.828365]  0000000000000f48 0000000000000008 ffffffff81c375e0 ffffffff00000000
[   65.835862] Call Trace:
[   65.838317]  [<ffffffff8162024f>] dump_stack+0x46/0x58
[   65.843471]  [<ffffffff8161ca6e>] dump_header+0x7a/0x1be
[   65.848791]  [<ffffffff812ee4c3>] ? ___ratelimit+0x93/0x110
[   65.854373]  [<ffffffff8112f65b>] oom_kill_process+0x1cb/0x330
[   65.860234]  [<ffffffff8112fe20>] out_of_memory+0x470/0x4c0
[   65.865817]  [<ffffffff81135659>] __alloc_pages_nodemask+0xab9/0xad0
[   65.872178]  [<ffffffff812cadf9>] ? blk_recount_segments+0x29/0x40
[   65.878375]  [<ffffffff81173cb3>] alloc_pages_vma+0xa3/0x150
[   65.884048]  [<ffffffff8116786b>] read_swap_cache_async+0x10b/0x190
[   65.890324]  [<ffffffff8116798e>] swapin_readahead+0x9e/0xf0
[   65.895992]  [<ffffffff81154e4f>] handle_pte_fault+0x29f/0xa60
[   65.901832]  [<ffffffff81124cda>] ? __perf_sw_event+0x16a/0x190
[   65.907761]  [<ffffffff81124cda>] ? __perf_sw_event+0x16a/0x190
[   65.913689]  [<ffffffff8108d5be>] ? update_curr+0x1ee/0x200
[   65.919269]  [<ffffffff811567d6>] handle_mm_fault+0x256/0x5d0
[   65.925027]  [<ffffffff8162aa02>] __do_page_fault+0x182/0x4c0
[   65.930787]  [<ffffffff81122b56>] ? __perf_event_task_sched_in+0x196/0x1b0
[   65.937670]  [<ffffffff810819f8>] ? finish_task_switch+0xa8/0xe0
[   65.943684]  [<ffffffff81624bef>] ? __schedule+0x3bf/0x7f0
[   65.949177]  [<ffffffff8162ad4e>] do_page_fault+0xe/0x10
[   65.954495]  [<ffffffff816273f2>] page_fault+0x22/0x30
[   65.959641]  [<ffffffff812f4a09>] ? copy_user_enhanced_fast_string+0x9/0x20
[   65.966611]  [<ffffffff812fa2d7>] ? memcpy_toiovec+0x47/0x80
[   65.972286]  [<ffffffff815c81c7>] unix_stream_recvmsg+0x4e7/0x8d0
[   65.978392]  [<ffffffff81077460>] ? remove_wait_queue+0x50/0x50
[   65.984321]  [<ffffffff81512076>] sock_aio_read.part.11+0x156/0x170
[   65.990596]  [<ffffffff81124cda>] ? __perf_sw_event+0x16a/0x190
[   65.996522]  [<ffffffff815120b3>] sock_aio_read+0x23/0x30
[   66.001930]  [<ffffffff8119407a>] do_sync_read+0x7a/0xb0
[   66.007254]  [<ffffffff8119509d>] vfs_read+0x16d/0x180
[   66.012398]  [<ffffffff81195262>] SyS_read+0x52/0xa0
[   66.017369]  [<ffffffff810d6dd0>] ? __audit_syscall_exit+0x200/0x280
[   66.023728]  [<ffffffff8162f482>] system_call_fastpath+0x16/0x1b

Since it always completed two hackbench runs and then crashed, I reduced the test to:

perf stat -r 10 ./hackbench 50

And kicked off ktest.pl to do the bisect. It came up with this commit as
the culprit:

commit 318df36e57c0ca9f2146660d41ff28e8650af423
Author: Joonsoo Kim <iamjoonsoo.kim@....com>
Date:   Wed Jun 19 15:33:55 2013 +0900

    slub: do not put a slab to cpu partial list when cpu_partial is 0
    
    In free path, we don't check number of cpu_partial, so one slab can
    be linked in cpu partial list even if cpu_partial is 0. To prevent
    this, we should check number of cpu_partial in put_cpu_partial().
    
    Acked-by: Christoph Lameter <cl@...ux.com>
    Reviewed-by: Wanpeng Li <liwanp@...ux.vnet.ibm.com>
    Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@....com>
    Signed-off-by: Pekka Enberg <penberg@...nel.org>


I reverted the commit, and sure enough, perf now can run hackbench for
all the runs I specify.
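
For anyone repeating the bisect without ktest.pl, a check along the following
lines could serve as the good/bad test for "git bisect run". This is only a
minimal sketch, not what was used above: the function names and the script
name check_oom.py are hypothetical, and it assumes the kernel log of each
test boot has been saved to a file (for example from a serial console, since
the box may not stay usable after the OOM).

```python
import re
import sys

# Matches the kernel log line printed when the OOM killer fires, e.g.
# "[   65.785108] hackbench invoked oom-killer: gfp_mask=0x200da, ..."
OOM_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(\S+) invoked oom-killer:")


def oom_victims(log_text):
    """Return the names of the tasks that invoked the OOM killer, in log order."""
    return [m.group(1) for line in log_text.splitlines()
            if (m := OOM_RE.match(line))]


def bisect_exit_code(log_text):
    """Exit status for git bisect run: 0 = good (no OOM seen), 1 = bad."""
    return 1 if oom_victims(log_text) else 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 check_oom.py console.log
    with open(sys.argv[1]) as f:
        sys.exit(bisect_exit_code(f.read()))
```

The exit status follows the "git bisect run" convention, where 0 marks the
revision good and 1 marks it bad.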

-- Steve


