lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Tue, 16 May 2017 14:39:27 +0100 From: Catalin Marinas <catalin.marinas@....com> To: Michal Hocko <mhocko@...nel.org> Cc: "Luis R. Rodriguez" <mcgrof@...nel.org>, Andy Lutomirski <luto@...capital.net>, Andrew Morton <akpm@...ux-foundation.org>, Ingo Molnar <mingo@...nel.org>, Andy Lutomirski <luto@...nel.org>, Kees Cook <keescook@...omium.org>, "Eric W. Biederman" <ebiederm@...ssion.com>, Mateusz Guzik <mguzik@...hat.com>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org> Subject: Re: kmemleak splat on copy_process() Thanks for cc'ing me. The vmalloc allocations have always been tricky for kmemleak since there are 2-3 other memory locations with the same value as the vmalloc'ed object: vm_struct.addr and vmap_area.va_start; occasionally we have vmap_area.va_end pointing to the next vmap_area.va_start. To have a more reliable object reference counting, kmemleak avoids scanning most of the vmap_area structure (apart from vmap_area.vm) and requires a minimum count of 2 references for a vmalloc'ed object to not be considered a leak (that is vm_struct.addr and the caller's location, in this case task->stack). > On Mon 15-05-17 23:53:18, Luis R. Rodriguez wrote: > > root@...gy:~# cat /sys/kernel/debug/kmemleak > > unreferenced object 0xffffa07500d4c000 (size 16384): > > comm "bash", pid 1349, jiffies 4294895999 (age 263.204s) > > hex dump (first 32 bytes): > > 9d 6e ac 57 00 00 00 00 00 00 00 00 00 00 00 00 .n.W............ > > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > > backtrace: > > [<ffffffffa5464cca>] kmemleak_alloc+0x4a/0xa0 > > [<ffffffffa4fdfe6c>] __vmalloc_node_range+0x20c/0x2b0 > > [<ffffffffa4e7d1a2>] copy_process.part.37+0x5c2/0x1af0 > > [<ffffffffa4e7e89f>] _do_fork+0xcf/0x390 > > [<ffffffffa4e7ec09>] SyS_clone+0x19/0x20 > > [<ffffffffa4e03b0b>] do_syscall_64+0x5b/0xc0 > > [<ffffffffa547072b>] return_from_SYSCALL_64+0x0/0x6a > > [<ffffffffffffffff>] 0xffffffffffffffff This stack here was allocated by bash for a child process via alloc_thread_stack_node() when CONFIG_VMAP_STACK is enabled. When the child died, free_stack_thread() did not call vfree_atomic() but rather stored the corresponding vm_struct pointer in cached_stacks[i]. However, the original task->stack pointer was lost (child task_struct freed), so when scanning the memory kmemleak can only find a single reference to the original stack (via vm_struct cached_stacks[i]) rather than 2 as required for vmalloc'ed objects. BTW, you can get more info about a specific object, including the number of references found, with: echo dump=0xffffa07500d4c000 > /sys/kernel/debug/kmemleak So basically kernel/fork.c has its own thread stack allocator on top of vmalloc and kmemleak gets confused. A quick workaround: ------------8<------------------------------ diff --git a/kernel/fork.c b/kernel/fork.c index 06d759ab4c62..785907fccf67 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -214,6 +214,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node) this_cpu_write(cached_stacks[i], NULL); tsk->stack_vm_area = s; + kmemleak_alloc(s->addr, THREAD_SIZE, 2, GFP_ATOMIC); local_irq_enable(); return s->addr; } @@ -254,6 +255,7 @@ static inline void free_thread_stack(struct task_struct *tsk) continue; this_cpu_write(cached_stacks[i], tsk->stack_vm_area); + kmemleak_free(tsk->stack); local_irq_restore(flags); return; } ------------8<------------------------------ A better solution I think would be something like kmemleak_ignore() on the free path paired with a new kmemleak_unignore() in order to avoid the freeing/re-allocation of the kmemleak metadata. A more complex solution (which I'm not yet convinced it works) may be to pass the number of references of an objects down to the objects it refers. In the vmalloc case we normally have a single reference to vm_struct and two to the vmalloc'ed object. If a reference to the vmalloc'ed object disappeared we could balance it with an increased number of references to the vm_struct object. But this option requires more research. Unless there are better ideas, I'll post a patch adding kmemleak_ignore/unignore annotations to kerne/fork.c (and the corresponding mm/kmemleak.c changes). Thanks. -- Catalin
Powered by blists - more mailing lists