linux-kernel - Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when orig

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <49ef066d-d001-411e-8db7-f064bdc2104c@suse.cz>
Date: Fri, 4 Oct 2024 11:18:13 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Marco Elver <elver@...gle.com>
Cc: Feng Tang <feng.tang@...el.com>, Andrew Morton
 <akpm@...ux-foundation.org>, Christoph Lameter <cl@...ux.com>,
 Pekka Enberg <penberg@...nel.org>, David Rientjes <rientjes@...gle.com>,
 Joonsoo Kim <iamjoonsoo.kim@....com>,
 Roman Gushchin <roman.gushchin@...ux.dev>,
 Hyeonggon Yoo <42.hyeyoo@...il.com>, Andrey Konovalov
 <andreyknvl@...il.com>, Shuah Khan <skhan@...uxfoundation.org>,
 David Gow <davidgow@...gle.com>, Danilo Krummrich <dakr@...nel.org>,
 Alexander Potapenko <glider@...gle.com>,
 Andrey Ryabinin <ryabinin.a.a@...il.com>, Dmitry Vyukov
 <dvyukov@...gle.com>, Vincenzo Frascino <vincenzo.frascino@....com>,
 linux-mm@...ck.org, kasan-dev@...glegroups.com,
 linux-kernel@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>
Subject: Re: [PATCH v2 0/5] mm/slub: Improve data handling of krealloc() when
 orig_size is enabled

On 10/4/24 08:44, Marco Elver wrote:
> On Wed, 2 Oct 2024 at 12:42, Vlastimil Babka <vbabka@...e.cz> wrote:
>>
>> On 9/11/24 08:45, Feng Tang wrote:
>> > Danilo Krummrich's patch [1] raised one problem about krealloc() that
>> > its caller doesn't pass the old request size, say the object is 64
>> > bytes kmalloc one, but caller originally only requested 48 bytes. Then
>> > when krealloc() shrinks or grows in the same object, or allocate a new
>> > bigger object, it lacks this 'original size' information to do accurate
>> > data preserving or zeroing (when __GFP_ZERO is set).
>> >
>> > Thus with slub debug redzone and object tracking enabled, parts of the
>> > object after krealloc() might contain redzone data instead of zeroes,
>> > which is violating the __GFP_ZERO guarantees. Good thing is in this
>> > case, kmalloc caches do have this 'orig_size' feature, which could be
>> > used to improve the situation here.
>> >
>> > To make the 'orig_size' accurate, we adjust some kasan/slub meta data
>> > handling. Also add a slub kunit test case for krealloc().
>> >
>> > This patchset has dependency over patches in both -mm tree and -slab
>> > trees, so it is written based on linux-next tree '20240910' version.
>> >
>> > [1]. https://lore.kernel.org/lkml/20240812223707.32049-1-dakr@kernel.org/
>>
>> Thanks, added to slab/for-next
> 
> This series just hit -next, and we're seeing several "KFENCE: memory
> corruption ...". Here's one:
> https://lore.kernel.org/all/66ff8bf6.050a0220.49194.0453.GAE@google.com/
> 
> One more (no link):
> 
>> ==================================================================
>> BUG: KFENCE: memory corruption in xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051
>>
>> Corrupted memory at 0xffff88823bf5a0d0 [ 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 ] (in kfence-#172):
>> xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051
>> xfs_iext_destroy+0x66/0x100 fs/xfs/libxfs/xfs_iext_tree.c:1062
>> xfs_inode_free_callback+0x91/0x1d0 fs/xfs/xfs_icache.c:145
>> rcu_do_batch kernel/rcu/tree.c:2567 [inline]
> [...]
>>
>> kfence-#172: 0xffff88823bf5a000-0xffff88823bf5a0cf, size=208, cache=kmalloc-256
>>
>> allocated by task 5494 on cpu 0 at 101.266046s (0.409225s ago):
>> __do_krealloc mm/slub.c:4784 [inline]
>> krealloc_noprof+0xd6/0x2e0 mm/slub.c:4838
>> xfs_iext_realloc_root fs/xfs/libxfs/xfs_iext_tree.c:613 [inline]
> [...]
>>
>> freed by task 16 on cpu 0 at 101.573936s (0.186416s ago):
>> xfs_iext_destroy_node+0xab/0x670 fs/xfs/libxfs/xfs_iext_tree.c:1051
>> xfs_iext_destroy+0x66/0x100 fs/xfs/libxfs/xfs_iext_tree.c:1062
>> xfs_inode_free_callback+0x91/0x1d0 fs/xfs/xfs_icache.c:145
> [...]
>>
>> CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.12.0-rc1-next-20241003-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
>> ==================================================================
> 
> Unfortunately there's no reproducer yet it seems. Unless it's
> immediately obvious to say what's wrong, is it possible to take this
> series out of -next to confirm this series is causing the memory
> corruptions? Syzbot should then stop finding these crashes.

I think it's commit d0a38fad51cc7 doing in __do_krealloc()

-               ks = ksize(p);
+
+               s = virt_to_cache(p);
+               orig_size = get_orig_size(s, (void *)p);
+               ks = s->object_size;

so for kfence objects we don't get their actual allocation size but the
potentially larger bucket size?

I guess we could do:

ks = kfence_ksize(p) ?: s->object_size;

?

> Thanks,
> -- Marco