linux-kernel - Re: [BUG] hackbench locks up with perf in 3.11-rc1 and beyond

Date:	Thu, 8 Aug 2013 14:04:41 +0900
From:	Joonsoo Kim <iamjoonsoo.kim@....com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Christoph Lameeter <cl@...ux.com>,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>,
	Pekka Enberg <penberg@...nel.org>
Subject: Re: [BUG] hackbench locks up with perf in 3.11-rc1 and beyond

On Thu, Aug 08, 2013 at 12:08:56AM -0400, Steven Rostedt wrote:
> I went to do some benchmarks on the jump label code, and ran:
> 
> 
> perf stat -r 100 ./hackbench 50
> 
> It ran twice, and then would die with:
> 
> [   65.785108] hackbench invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
> [   65.792921] hackbench cpuset=/ mems_allowed=0
> [   65.797286] CPU: 6 PID: 6042 Comm: hackbench Not tainted 3.11.0-rc4-test+ #26
> [   65.804428] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
> [   65.813392]  0000000000000000 ffff8800105f5478 ffffffff8162024f 000000000000001e
> [   65.820876]  ffff8800105f9770 ffff8800105f54f8 ffffffff8161ca6e 0000000000000000
> [   65.828365]  0000000000000f48 0000000000000008 ffffffff81c375e0 ffffffff00000000
> [   65.835862] Call Trace:
> [   65.838317]  [<ffffffff8162024f>] dump_stack+0x46/0x58
> [   65.843471]  [<ffffffff8161ca6e>] dump_header+0x7a/0x1be
> [   65.848791]  [<ffffffff812ee4c3>] ? ___ratelimit+0x93/0x110
> [   65.854373]  [<ffffffff8112f65b>] oom_kill_process+0x1cb/0x330
> [   65.860234]  [<ffffffff8112fe20>] out_of_memory+0x470/0x4c0
> [   65.865817]  [<ffffffff81135659>] __alloc_pages_nodemask+0xab9/0xad0
> [   65.872178]  [<ffffffff812cadf9>] ? blk_recount_segments+0x29/0x40
> [   65.878375]  [<ffffffff81173cb3>] alloc_pages_vma+0xa3/0x150
> [   65.884048]  [<ffffffff8116786b>] read_swap_cache_async+0x10b/0x190
> [   65.890324]  [<ffffffff8116798e>] swapin_readahead+0x9e/0xf0
> [   65.895992]  [<ffffffff81154e4f>] handle_pte_fault+0x29f/0xa60
> [   65.901832]  [<ffffffff81124cda>] ? __perf_sw_event+0x16a/0x190
> [   65.907761]  [<ffffffff81124cda>] ? __perf_sw_event+0x16a/0x190
> [   65.913689]  [<ffffffff8108d5be>] ? update_curr+0x1ee/0x200
> [   65.919269]  [<ffffffff811567d6>] handle_mm_fault+0x256/0x5d0
> [   65.925027]  [<ffffffff8162aa02>] __do_page_fault+0x182/0x4c0
> [   65.930787]  [<ffffffff81122b56>] ? __perf_event_task_sched_in+0x196/0x1b0
> [   65.937670]  [<ffffffff810819f8>] ? finish_task_switch+0xa8/0xe0
> [   65.943684]  [<ffffffff81624bef>] ? __schedule+0x3bf/0x7f0
> [   65.949177]  [<ffffffff8162ad4e>] do_page_fault+0xe/0x10
> [   65.954495]  [<ffffffff816273f2>] page_fault+0x22/0x30
> [   65.959641]  [<ffffffff812f4a09>] ? copy_user_enhanced_fast_string+0x9/0x20
> [   65.966611]  [<ffffffff812fa2d7>] ? memcpy_toiovec+0x47/0x80
> [   65.972286]  [<ffffffff815c81c7>] unix_stream_recvmsg+0x4e7/0x8d0
> [   65.978392]  [<ffffffff81077460>] ? remove_wait_queue+0x50/0x50
> [   65.984321]  [<ffffffff81512076>] sock_aio_read.part.11+0x156/0x170
> [   65.990596]  [<ffffffff81124cda>] ? __perf_sw_event+0x16a/0x190
> [   65.996522]  [<ffffffff815120b3>] sock_aio_read+0x23/0x30
> [   66.001930]  [<ffffffff8119407a>] do_sync_read+0x7a/0xb0
> [   66.007254]  [<ffffffff8119509d>] vfs_read+0x16d/0x180
> [   66.012398]  [<ffffffff81195262>] SyS_read+0x52/0xa0
> [   66.017369]  [<ffffffff810d6dd0>] ? __audit_syscall_exit+0x200/0x280
> [   66.023728]  [<ffffffff8162f482>] system_call_fastpath+0x16/0x1b
> 
> As it always ran hackbench twice and then crashed, I changed the test to be just:
> 
> perf stat -r 10 ./hackbench 50
> 
> And kicked off ktest.pl to do the bisect. It came up with this commit as
> the culprit:
> 
> commit 318df36e57c0ca9f2146660d41ff28e8650af423
> Author: Joonsoo Kim <iamjoonsoo.kim@....com>
> Date:   Wed Jun 19 15:33:55 2013 +0900
> 
>     slub: do not put a slab to cpu partial list when cpu_partial is 0
>     
>     In free path, we don't check number of cpu_partial, so one slab can
>     be linked in cpu partial list even if cpu_partial is 0. To prevent this,
>     we should check number of cpu_partial in put_cpu_partial().
>     
>     Acked-by: Christoph Lameeter <cl@...ux.com>
>     Reviewed-by: Wanpeng Li <liwanp@...ux.vnet.ibm.com>
>     Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@....com>
>     Signed-off-by: Pekka Enberg <penberg@...nel.org>
> 
> 
> I reverted the commit, and sure enough, perf now can run hackbench for
> all the runs I specify.

Hello,

Sorry about that.
I now think that this commit is buggy and should be reverted.

To confirm that, could I ask a question about your configuration, Steven?
I guess you have set cpu_partial to 0 for all kmem caches via sysfs, is that right?

In that case, a memory leak is possible.
The code flow of the possible leak is as follows.

* in __slab_free()
1. (!new.inuse || !prior) && !was_frozen
2. !kmem_cache_debug && !prior
3. new.frozen = 1
4. after cmpxchg_double_slab, run the (!n) case with new.frozen=1
5. with this patch, put_cpu_partial() doesn't do anything,
	because this cache's cpu_partial is 0
6. return

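The leak occurs in step 5: the slab has been marked frozen, but
put_cpu_partial() returns without linking it to the cpu partial list,
so the slab is no longer tracked on any list.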
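
The flow above can be sketched as a small userspace model. This is only
an illustration, not the kernel code: every name in it (toy_cache,
toy_slab, toy_put_cpu_partial, toy_slab_free, toy_slab_leaked, and the
"patched" flag) is hypothetical, it models only the !prior branch of
step 1, and it tracks only the cpu partial list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
struct toy_cache {
	unsigned int cpu_partial;	/* models the per-cache cpu_partial value */
	bool patched;			/* true: behave as with the commit applied */
};

struct toy_slab {
	bool frozen;			/* models page->frozen */
	bool on_cpu_partial;		/* linked on the cpu partial list? */
};

/* Step 5: with the patch, nothing happens when cpu_partial is 0. */
static void toy_put_cpu_partial(const struct toy_cache *c, struct toy_slab *s)
{
	if (c->patched && c->cpu_partial == 0)
		return;
	s->on_cpu_partial = true;
}

/*
 * Models the part of the free path described in steps 1-6.
 * prior_was_null stands for !prior (the slab had no free objects),
 * debug stands for kmem_cache_debug().
 */
static void toy_slab_free(const struct toy_cache *c, struct toy_slab *s,
			  bool prior_was_null, bool debug)
{
	bool was_frozen;

	assert(c != NULL && s != NULL);
	was_frozen = s->frozen;

	if (prior_was_null && !was_frozen && !debug) {	/* steps 1 and 2 */
		s->frozen = true;			/* step 3 */
		toy_put_cpu_partial(c, s);		/* steps 4 and 5 */
		return;					/* step 6 */
	}
}

/* A slab is "leaked" in this model if it is frozen but on no list. */
static bool toy_slab_leaked(const struct toy_slab *s)
{
	return s->frozen && !s->on_cpu_partial;
}
```

In this model, calling toy_slab_free() on an unfrozen slab with
prior_was_null set and debug clear, for a cache with cpu_partial = 0 and
patched = true, leaves toy_slab_leaked() returning true. With patched =
false, or with a non-zero cpu_partial, the slab ends up on the cpu
partial list instead.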

I have a solution to prevent this problem, but at this stage, IMHO,
reverting the commit may be better.

Thanks.