Date: Mon, 14 Nov 2022 19:07:26 +0800
From: "yebin (H)" <yebin10@...wei.com>
To: Eric Whitney <enwlinux@...il.com>
CC: Jan Kara <jack@...e.cz>, Ye Bin <yebin@...weicloud.com>, <tytso@....edu>,
 <adilger.kernel@...ger.ca>, <linux-ext4@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>,
 <syzbot+05a0f0ccab4a25626e38@...kaller.appspotmail.com>
Subject: Re: [PATCH] ext4: fix possible memory leak when enable bigalloc feature

On 2022/11/8 9:28, Eric Whitney wrote:
> * yebin (H) <yebin10@...wei.com>:
>>
>> On 2022/11/7 21:46, Jan Kara wrote:
>>> Let me CC Eric who wrote this code...
>>>
>>> On Mon 07-11-22 09:54:15, Ye Bin wrote:
>>>> From: Ye Bin <yebin10@...wei.com>
>>>>
>>>> Syzbot found the following issue:
>>>> BUG: memory leak
>>>> unreferenced object 0xffff8881bde17420 (size 32):
>>>>   comm "rep", pid 2327, jiffies 4295381963 (age 32.265s)
>>>>   hex dump (first 32 bytes):
>>>>     01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>>>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>>>   backtrace:
>>>>     [<00000000ac6d38f8>] __insert_pending+0x13c/0x2d0
>>>>     [<00000000d717de3b>] ext4_es_insert_delayed_block+0x399/0x4e0
>>>>     [<000000004be03913>] ext4_da_map_blocks.constprop.0+0x739/0xfa0
>>>>     [<00000000885a832a>] ext4_da_get_block_prep+0x10c/0x440
>>>>     [<0000000029b7f8ef>] __block_write_begin_int+0x28d/0x860
>>>>     [<00000000e182ebc3>] ext4_da_write_inline_data_begin+0x2d1/0xf30
>>>>     [<00000000ced0c8a2>] ext4_da_write_begin+0x612/0x860
>>>>     [<000000008d5f27fa>] generic_perform_write+0x215/0x4d0
>>>>     [<00000000552c1cde>] ext4_buffered_write_iter+0x101/0x3b0
>>>>     [<0000000052177ae8>] do_iter_readv_writev+0x19f/0x340
>>>>     [<000000004b9de834>] do_iter_write+0x13b/0x650
>>>>     [<00000000e2401b9b>] iter_file_splice_write+0x5a5/0xab0
>>>>     [<0000000023aa5d90>] direct_splice_actor+0x103/0x1e0
>>>>     [<0000000089e00fc1>] splice_direct_to_actor+0x2c9/0x7b0
>>>>     [<000000004386851e>] do_splice_direct+0x159/0x280
>>>>     [<00000000b567e609>] do_sendfile+0x932/0x1200
>>>>
>>>> Now, 'ext4_clear_inode' don't cleanup pending tree which will lead to memory
>>>> leak.
>>>> To solve above issue, cleanup pending tree when clear inode.
>>>>
>>>> Reported-by: syzbot+05a0f0ccab4a25626e38@...kaller.appspotmail.com
>>>> Signed-off-by: Ye Bin <yebin10@...wei.com>
>>> So I'd think that by the time we are freeing inode all pending reservations
>>> should be resolved and thus the tree should be empty. In that case you'd be
>>> just masking some other bug where we failed to cleanup pending information
>>> at the right moment. But maybe I'm missing something - that's why I've
>>> added Eric to have a look ;)
>>>
>>> 								Honza
>> Yes, this is really a circumvention plan. Maybe we can check here. If the
>> pending tree is not empty, we still need to clean up resources to prevent
>> memory leaks. Let me analyze this process again.
> Jan is right. If there are pending reservations remaining by the time we
> get to ext4_clear_inode(), something's broken somewhere else. The code is
> designed to clean up any and all pending reservations when a file is truncated,
> and that should happen in ext4_evict_inode() before ext4_clear_inode() is
> called. (It's probably unnecessary as a result, but the call to
> ext4_es_remove_extent() in ext4_clear_inode() should free any stray pending
> reservations via __es_remove_extent() and get_rsvd() unless they're somehow not
> consistent with the extents in the status tree.)
>
> If there are leaking pending reservations, it may be that the cluster
> accounting isn't working correctly. So, the better thing to do is to find
> the root cause of the leak and fix it at its source.
>
> I can guess what the general cause of the breakage might be. The presence of
> ext4_da_write_inline_data_begin() on the stack suggests that the inline_data
> option is being used with bigalloc here. If so, that combination is unlikely
> to work well. To my knowledge, the cluster accounting code has not yet been
> deliberately integrated with or well tested with inline.
>
> Eric
>
I found that this issue is fixed by my previous patch
1b8f787ef547230a3249bcf897221ef0cc78481b ("ext4: fix warning in
'ext4_da_release_space'"). So the above issue occurred via migrate.
>>>> ---
>>>>  fs/ext4/extents_status.c | 22 ++++++++++++++++++++++
>>>>  fs/ext4/extents_status.h |  1 +
>>>>  fs/ext4/super.c          |  1 +
>>>>  3 files changed, 24 insertions(+)
>>>>
>>>> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
>>>> index cd0a861853e3..5f6b218464de 100644
>>>> --- a/fs/ext4/extents_status.c
>>>> +++ b/fs/ext4/extents_status.c
>>>> @@ -1947,6 +1947,28 @@ void ext4_remove_pending(struct inode *inode, ext4_lblk_t lblk)
>>>>  	write_unlock(&ei->i_es_lock);
>>>>  }
>>>> +void ext4_clear_inode_pending(struct inode *inode)
>>>> +{
>>>> +	struct ext4_inode_info *ei = EXT4_I(inode);
>>>> +	struct pending_reservation *pr;
>>>> +	struct ext4_pending_tree *tree;
>>>> +	struct rb_node *node;
>>>> +
>>>> +	if (EXT4_SB(inode->i_sb)->s_cluster_ratio == 1)
>>>> +		return;
>>>> +
>>>> +	write_lock(&ei->i_es_lock);
>>>> +	tree = &EXT4_I(inode)->i_pending_tree;
>>>> +	node = rb_first(&tree->root);
>>>> +	while (node) {
>>>> +		pr = rb_entry(node, struct pending_reservation, rb_node);
>>>> +		node = rb_next(node);
>>>> +		rb_erase(&pr->rb_node, &tree->root);
>>>> +		kmem_cache_free(ext4_pending_cachep, pr);
>>>> +	}
>>>> +	write_unlock(&ei->i_es_lock);
>>>> +}
>>>> +
>>>>  /*
>>>>   * ext4_is_pending - determine whether a cluster has a pending reservation
>>>>   *                   on it
>>>> diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
>>>> index 4ec30a798260..25b605309c06 100644
>>>> --- a/fs/ext4/extents_status.h
>>>> +++ b/fs/ext4/extents_status.h
>>>> @@ -248,6 +248,7 @@ extern int __init ext4_init_pending(void);
>>>>  extern void ext4_exit_pending(void);
>>>>  extern void ext4_init_pending_tree(struct ext4_pending_tree *tree);
>>>>  extern void ext4_remove_pending(struct inode *inode, ext4_lblk_t lblk);
>>>> +extern void ext4_clear_inode_pending(struct inode *inode);
>>>>  extern bool ext4_is_pending(struct inode *inode, ext4_lblk_t lblk);
>>>>  extern int ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
>>>>  					 bool allocated);
>>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>>>> index 106fb06e24e8..160667dcf09a 100644
>>>> --- a/fs/ext4/super.c
>>>> +++ b/fs/ext4/super.c
>>>> @@ -1434,6 +1434,7 @@ void ext4_clear_inode(struct inode *inode)
>>>>  	clear_inode(inode);
>>>>  	ext4_discard_preallocations(inode, 0);
>>>>  	ext4_es_remove_extent(inode, 0, EXT_MAX_BLOCKS);
>>>> +	ext4_clear_inode_pending(inode);
>>>>  	dquot_drop(inode);
>>>>  	if (EXT4_I(inode)->jinode) {
>>>>  		jbd2_journal_release_jbd_inode(EXT4_JOURNAL(inode),
>>>> --
>>>> 2.31.1
>>>>
> .
>
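
Editorial note on the patch discussed in this thread: the teardown loop in the proposed ext4_clear_inode_pending() reads the successor before erasing and freeing the current node. The following is a minimal user-space sketch of that pattern, not kernel code. It assumes a plain singly linked list in place of the kernel rbtree, and every name in it (pending_rsv, pending_list, pending_insert, pending_drain) is hypothetical. The drain function also reports how many stray entries it found, which reflects the point made by Jan and Eric that leftover reservations at this stage signal a bug elsewhere rather than something to free silently.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for a pending reservation. */
struct pending_rsv {
	unsigned int lclu;		/* logical cluster number */
	struct pending_rsv *next;
};

/* Hypothetical container; the kernel keeps these in an rbtree instead. */
struct pending_list {
	struct pending_rsv *head;
};

/* Record a reservation for cluster lclu; returns 0, or -1 if malloc fails. */
static int pending_insert(struct pending_list *pl, unsigned int lclu)
{
	struct pending_rsv *pr = malloc(sizeof(*pr));

	if (!pr)
		return -1;
	pr->lclu = lclu;
	pr->next = pl->head;
	pl->head = pr;
	return 0;
}

/*
 * Free every remaining entry and return how many were found. The
 * successor is saved before the current node is freed, so the walk
 * never touches freed memory. A non-zero result means something
 * earlier failed to release its reservations.
 */
static size_t pending_drain(struct pending_list *pl)
{
	struct pending_rsv *pr = pl->head;
	size_t stray = 0;

	while (pr) {
		struct pending_rsv *next = pr->next;

		free(pr);
		pr = next;
		stray++;
	}
	pl->head = NULL;
	return stray;
}
```

A caller at teardown time could treat a non-zero return from pending_drain() as a reason to log a warning, so the leak is still prevented while the underlying accounting bug stays visible.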