Message-ID: <20090916152929.GA6737@linux.vnet.ibm.com>
Date: Wed, 16 Sep 2009 08:29:29 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: Eric Sesterhenn <eric.sesterhenn@...xperts.de>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: RCU callbacks and TREE_PREEMPT_RCU
On Wed, Sep 16, 2009 at 03:17:21PM +0100, Catalin Marinas wrote:
> Hi Paul,
>
> Eric was reporting some issues with kmemleak on 2.6.31 accessing freed
> memory under heavy stress (using the "stress" application). Basically,
> the system gets into an oom state (because of "stress -m 1000") and
> kmemleak fails to allocate its metadata (correct behaviour so far). At
> that point, it disables itself and schedules the clean-up work, which
> does this (among other locking, in the kmemleak_do_cleanup function in
> the latest mainline):
>
> rcu_read_lock();
> list_for_each_entry_rcu(object, &object_list, object_list)
> delete_object_full(object->pointer);
> rcu_read_unlock();
>
> The kmemleak objects are freed via put_object() with:
>
> call_rcu(&object->rcu, free_object_rcu);
>
> (the free_object_rcu calls kmem_cache_free).
>
> When TREE_PREEMPT_RCU is enabled, the rcu list traversing above fails
> with access to 0x6b6b6b6b but it is fine with TREE_PREEMPT_RCU=n and
> TREE_RCU=y. During clean-up, kmemleak objects should no longer be freed
> by other means since kmemleak was disabled and all callbacks are
> ignored. The system is a 900MHz P3, 256MB RAM, CONFIG_SMP=n.
>
> Is there something I'm doing wrong in kmemleak or a bug with RCU
> preemption? The kernel oops looks like this:
From your description and the code above, I must suspect a bug with
RCU preemption. A new one, as the only bugs I am currently chasing
involve NR_CPUS>32 (>64 on 64-bit systems).
CONFIG_SMP=n implies NR_CPUS==1 in your build, correct?
Thanx, Paul
> [ 5346.582119] kmemleak: Cannot allocate a kmemleak_object structure
> [ 5346.582208] Pid: 31302, comm: stress Not tainted 2.6.31-01335-g86d7101 #5
> [ 5346.582313] Call Trace:
> [ 5346.582414] [<c01c4125>] create_object+0x215/0x220
> [ 5346.582529] [<c0d3e660>] ? alloc_arch_preferred_bootmem+0x30/0x50
> [ 5346.582628] [<c0157532>] ? mark_held_locks+0x52/0x70
> [ 5346.582734] [<c0d3e660>] ? alloc_arch_preferred_bootmem+0x30/0x50
> [ 5346.582823] [<c0d3e6b8>] ? __free+0x38/0x90
> [ 5346.582941] [<c08ea9cb>] kmemleak_alloc+0x2b/0x60
> [ 5346.705312] [<c01c075c>] kmem_cache_alloc+0x11c/0x1a0
> [ 5346.705453] [<c05b7313>] ? cfq_set_request+0xf3/0x310
> [ 5346.705573] [<c0d3e660>] ? alloc_arch_preferred_bootmem+0x30/0x50
> [ 5346.705660] [<c05aeed3>] ? get_io_context+0x13/0x40
> [ 5346.705765] [<c05b7220>] ? cfq_set_request+0x0/0x310
> [ 5346.705850] [<c05b7313>] cfq_set_request+0xf3/0x310
> [ 5346.705968] [<c015767c>] ? trace_hardirqs_on_caller+0x12c/0x180
> [ 5346.706133] [<c05b7220>] ? cfq_set_request+0x0/0x310
> [ 5346.706230] [<c05a3fcf>] elv_set_request+0x1f/0x50
> [ 5346.706342] [<c05a8bbc>] get_request+0x27c/0x2f0
> [ 5346.706426] [<c05a91c2>] get_request_wait+0xe2/0x140
> [ 5346.706545] [<c0146290>] ? autoremove_wake_function+0x0/0x40
> [ 5346.706638] [<c05abd79>] __make_request+0x89/0x3e0
> [ 5346.706744] [<c05a7fe2>] generic_make_request+0x192/0x400
> [ 5346.706835] [<c05ad011>] submit_bio+0x71/0x110
> [ 5346.706939] [<c015767c>] ? trace_hardirqs_on_caller+0x12c/0x180
> [ 5346.797327] [<c01576db>] ? trace_hardirqs_on+0xb/0x10
> [ 5346.797478] [<c08fa239>] ? _spin_unlock_irqrestore+0x39/0x70
> [ 5346.797597] [<c019d55d>] ? test_set_page_writeback+0x6d/0x140
> [ 5346.797699] [<c01b607a>] swap_writepage+0x9a/0xd0
> [ 5346.797804] [<c01b60b0>] ? end_swap_bio_write+0x0/0x80
> [ 5346.797895] [<c01a0706>] shrink_page_list+0x316/0x700
> [ 5346.798003] [<c015aa9f>] ? __lock_acquire+0x40f/0xab0
> [ 5346.798170] [<c0159749>] ? validate_chain+0xe9/0x1030
> [ 5346.798260] [<c01a0cca>] shrink_list+0x1da/0x4e0
> [ 5346.798370] [<c01a1267>] shrink_zone+0x297/0x310
> [ 5346.798454] [<c01a1441>] ? shrink_slab+0x161/0x1a0
> [ 5346.798563] [<c01a1661>] try_to_free_pages+0x1e1/0x2e0
> [ 5346.798650] [<c019f5f0>] ? isolate_pages_global+0x0/0x1e0
> [ 5346.798774] [<c019b76e>] __alloc_pages_nodemask+0x35e/0x5d0
> [ 5346.798864] [<c01aa957>] do_wp_page+0xb7/0x690
> [ 5346.798968] [<c01abf83>] ? handle_mm_fault+0x263/0x600
> [ 5346.929240] [<c08fa4b5>] ? _spin_lock+0x65/0x70
> [ 5346.929378] [<c01ac185>] handle_mm_fault+0x465/0x600
> [ 5346.929496] [<c08fc7fb>] ? do_page_fault+0x14b/0x390
> [ 5346.929589] [<c014a4fc>] ? down_read_trylock+0x5c/0x70
> [ 5346.929696] [<c08fc860>] do_page_fault+0x1b0/0x390
> [ 5346.929780] [<c08fc6b0>] ? do_page_fault+0x0/0x390
> [ 5346.929884] [<c08fad18>] error_code+0x70/0x78
> [ 5347.889442] BUG: unable to handle kernel paging request at 6b6b6b6b
> [ 5347.889626] IP: [<c01c31e0>] kmemleak_do_cleanup+0x60/0xa0
> [ 5347.889835] *pde = 00000000
> [ 5347.889933] Oops: 0000 [#1] PREEMPT
> [ 5347.890038] last sysfs file: /sys/class/vc/vcsa9/dev
> [ 5347.890038] Modules linked in: [last unloaded: rcutorture]
> [ 5347.890038]
> [ 5347.890038] Pid: 5, comm: events/0 Not tainted (2.6.31-01335-g86d7101 #5)
> System Name
> [ 5347.890038] EIP: 0060:[<c01c31e0>] EFLAGS: 00010286 CPU: 0
> [ 5347.890038] EIP is at kmemleak_do_cleanup+0x60/0xa0
> [ 5347.890038] EAX: 002ed661 EBX: 6b6b6b43 ECX: 00000007 EDX: 6b6b6b6b
> [ 5347.890038] ESI: cf8b40b0 EDI: 00000002 EBP: cf8b8f3c ESP: cf8b8f28
> [ 5347.890038] DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
> [ 5347.890038] Process events/0 (pid: 5, ti=cf8b8000 task=cf8c3500
> task.ti=cf8b8000)
> [ 5347.890038] Stack:
> [ 5347.890038] 00000002 00000001 00000000 c01c3180 c0cd6640 cf8b8f98 c0142857
> 00000000
> [ 5347.890038] <0> 00000002 00000000 c01427f6 cf8b40d4 cf8b40dc cf8c3500
> c01c3180 c0cd6640
> [ 5347.890038] <0> c0f938b0 c0a89514 00000000 00000000 00000000 cf8c3500
> c0146290 cf8b8f84
> [ 5347.890038] Call Trace:
> [ 5347.890038] [<c01c3180>] ? kmemleak_do_cleanup+0x0/0xa0
> [ 5347.890038] [<c0142857>] ? worker_thread+0x1d7/0x300
> [ 5347.890038] [<c01427f6>] ? worker_thread+0x176/0x300
> [ 5347.890038] [<c01c3180>] ? kmemleak_do_cleanup+0x0/0xa0
> [ 5347.890038] [<c0146290>] ? autoremove_wake_function+0x0/0x40
> [ 5347.890038] [<c0142680>] ? worker_thread+0x0/0x300
> [ 5347.890038] [<c01461b7>] ? kthread+0x77/0x80
> [ 5347.890038] [<c0146140>] ? kthread+0x0/0x80
> [ 5347.890038] [<c010356b>] ? kernel_thread_helper+0x7/0x1c
> [ 5347.890038] Code: 89 44 24 04 b8 e0 2c cd c0 c7 04 24 02 00 00 00 e8 76 7f
> f9 ff 8b 15 d0 66 cd c0 eb 0b 8b 43 58 e8 76 ff ff ff 8b 53 28 8d 5a d8 <8b>
> 43 28 0f 18 00 90 81 fa d0 66 cd c0 75 e3 b9 ef 31 1c c0 ba
> [ 5347.890038] EIP: [<c01c31e0>] kmemleak_do_cleanup+0x60/0xa0 SS:ESP
> 0068:cf8b8f28
> [ 5347.890038] CR2: 000000006b6b6b6b
>
>
> Thanks.
>
> --
> Catalin
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/