linux-kernel - Re: [PATCH v4] debugobjects: scale the static pool size

Message-ID: <211af3b2-bc56-2d1b-c6c2-f6853797a7a1@gmx.us>
Date:   Sun, 25 Nov 2018 15:42:12 -0500
From:   Qian Cai <cai@....us>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Waiman Long <longman@...hat.com>,
        Yang Shi <yang.shi@...ux.alibaba.com>, arnd@...db.de,
        linux kernel <linux-kernel@...r.kernel.org>,
        Catalin Marinas <catalin.marinas@....com>
Subject: Re: [PATCH v4] debugobjects: scale the static pool size



On 11/23/18 10:01 PM, Qian Cai wrote:
> 
> 
>> On Nov 22, 2018, at 4:56 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
>>
>> On Tue, 20 Nov 2018, Qian Cai wrote:
>>
>> Looking deeper at that.
>>
>>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>>> index 70935ed91125..140571aa483c 100644
>>> --- a/lib/debugobjects.c
>>> +++ b/lib/debugobjects.c
>>> @@ -23,9 +23,81 @@
>>> #define ODEBUG_HASH_BITS	14
>>> #define ODEBUG_HASH_SIZE	(1 << ODEBUG_HASH_BITS)
>>>
>>> -#define ODEBUG_POOL_SIZE	1024
>>> +#define ODEBUG_DEFAULT_POOL	512
>>> #define ODEBUG_POOL_MIN_LEVEL	256
>>>
>>> +/*
>>> + * Some debug objects are allocated during the early boot. Enabling some options
>>> + * like timers or workqueue objects may increase the size required significantly
>>> + * with large number of CPUs. For example (as today, 20 Nov. 2018),
>>> + *
>>> + * No. CPUs x 2 (worker pool) objects:
>>> + *
>>> + * start_kernel
>>> + *   workqueue_init_early
>>> + *     init_worker_pool
>>> + *       init_timer_key
>>> + *         debug_object_init
>>> + *
>>> + * No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
>>> + *
>>> + * sched_init
>>> + *   hrtick_rq_init
>>> + *     hrtimer_init
>>> + *
>>> + * CONFIG_DEBUG_OBJECTS_WORK:
>>> + * No. CPUs x 6 (workqueue) objects:
>>> + *
>>> + * workqueue_init_early
>>> + *   alloc_workqueue
>>> + *     __alloc_workqueue_key
>>> + *       alloc_and_link_pwqs
>>> + *         init_pwq
>>> + *
>>> + * Also, plus No. CPUs objects:
>>> + *
>>> + * perf_event_init
>>> + *    __init_srcu_struct
>>> + *      init_srcu_struct_fields
>>> + *        init_srcu_struct_nodes
>>> + *          __init_work
>>
>> None of the things are actually used or required _BEFORE_
>> debug_objects_mem_init() is invoked.
>>
>> The reason why the call is at this place in start_kernel() is
>> historical. It's because back in the days when debugobjects were added the
>> memory allocator was enabled way later than today. So we can just move the
>> debug_objects_mem_init() call right before sched_init() I think.
> 
> Well, now kmemleak_init() seems to complain that debug_objects_mem_init()
> is called before it.
> 
> [    0.078805] kmemleak: Cannot insert 0xc000000dff930000 into the object search tree (overlaps existing)
> [    0.078860] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.20.0-rc3+ #3
> [    0.078883] Call Trace:
> [    0.078904] [c000000001c8fcd0] [c000000000c96b34] dump_stack+0xe8/0x164 (unreliable)
> [    0.078935] [c000000001c8fd20] [c000000000486e84] create_object+0x344/0x380
> [    0.078962] [c000000001c8fde0] [c000000000489544] early_alloc+0x108/0x1f8
> [    0.078989] [c000000001c8fe20] [c00000000109738c] kmemleak_init+0x1d8/0x3d4
> [    0.079016] [c000000001c8ff00] [c000000001054028] start_kernel+0x5c0/0x6f8
> [    0.079043] [c000000001c8ff90] [c00000000000ae7c] start_here_common+0x1c/0x520
> [    0.079070] kmemleak: Kernel memory leak detector disabled
> [    0.079091] kmemleak: Object 0xc000000ffd587b68 (size 40):
> [    0.079112] kmemleak:   comm "swapper/0", pid 0, jiffies 4294937299
> [    0.079135] kmemleak:   min_count = -1
> [    0.079153] kmemleak:   count = 0
> [    0.079170] kmemleak:   flags = 0x5
> [    0.079188] kmemleak:   checksum = 0
> [    0.079206] kmemleak:   backtrace:
> [    0.079227]      __debug_object_init+0x688/0x700
> [    0.079250]      debug_object_activate+0x1e0/0x350
> [    0.079272]      __call_rcu+0x60/0x430
> [    0.079292]      put_object+0x60/0x80
> [    0.079311]      kmemleak_init+0x2cc/0x3d4
> [    0.079331]      start_kernel+0x5c0/0x6f8
> [    0.079351]      start_here_common+0x1c/0x520
> [    0.079380] kmemleak: Early log backtrace:
> [    0.079399]    memblock_alloc_try_nid_raw+0x90/0xcc
> [    0.079421]    sparse_init_nid+0x144/0x51c
> [    0.079440]    sparse_init+0x1a0/0x238
> [    0.079459]    initmem_init+0x1d8/0x25c
> [    0.079498]    setup_arch+0x3e0/0x464
> [    0.079517]    start_kernel+0xa4/0x6f8
> [    0.079536]    start_here_common+0x1c/0x520
> 

So this is a chicken-and-egg problem. Debug objects need kmemleak_init() to run
first, so that they can use kmemleak_ignore() on every debug object and avoid
overlaps like the one above.

while (obj_pool_free < debug_objects_pool_min_level) {

	new = kmem_cache_zalloc(obj_cache, gfp);
	if (!new)
		return;

	kmemleak_ignore(new);

However, there seems to be no way to move kmemleak_init() equally early in
start_kernel(), just before vmalloc_init() [1], because it appears to depend on
things like workqueues (schedule_work(&cleanup_work)) and RCU. Hence, it needs
to run after workqueue_init_early() and rcu_init().
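
To make the ordering constraint concrete, here is a rough userspace sketch (not
kernel code) that checks a call order against the dependencies described above.
The helper names and the three example orders are hypothetical, and the sketch
assumes that sched_init() runs before workqueue_init_early() and rcu_init() in
start_kernel().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model: "step" must run after "needs". */
struct init_dep {
	const char *step;
	const char *needs;
};

/* Dependencies as described in this thread. */
static const struct init_dep deps[] = {
	{ "kmemleak_init", "workqueue_init_early" },
	{ "kmemleak_init", "rcu_init" },
	{ "debug_objects_mem_init", "kmemleak_init" },
};

/* Simplified view of the order today (assumed, other calls left out). */
const char *const current_order[] = {
	"sched_init", "workqueue_init_early", "rcu_init",
	"kmemleak_init", "debug_objects_mem_init",
};

/* debug_objects_mem_init() moved right before sched_init(). */
const char *const proposed_order[] = {
	"debug_objects_mem_init", "sched_init",
	"workqueue_init_early", "rcu_init", "kmemleak_init",
};

/* kmemleak_init() moved early as well. */
const char *const both_early_order[] = {
	"kmemleak_init", "debug_objects_mem_init", "sched_init",
	"workqueue_init_early", "rcu_init",
};

/* Return the index of name in order[], or -1 if it is not listed. */
static int find_pos(const char *const *order, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(order[i], name) == 0)
			return (int)i;
	return -1;
}

/* Count (and print) the dependencies that the given order violates. */
int count_violations(const char *const *order, size_t n)
{
	int bad = 0;

	assert(order != NULL);
	for (size_t i = 0; i < sizeof(deps) / sizeof(deps[0]); i++) {
		int s = find_pos(order, n, deps[i].step);
		int d = find_pos(order, n, deps[i].needs);

		/* A missing entry is treated as a violation too. */
		if (s < 0 || d < 0 || d > s) {
			printf("violated: %s needs %s first\n",
			       deps[i].step, deps[i].needs);
			bad++;
		}
	}
	return bad;
}
```

In this model, count_violations() finds nothing wrong with current_order, flags
the debug_objects_mem_init() dependency for proposed_order, and flags both
kmemleak_init() dependencies for both_early_order, which is the chicken-and-egg
situation described above.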

Given that, maybe the best outcome is to stick with the alternative approach
that works [1] rather than messing with the order of debug_objects_mem_init()
in start_kernel(), which seems tricky. What do you think?

[1] https://goo.gl/18N78g
[2] https://goo.gl/My6ig6