linux-ext4 - Re: [PATCH] ext4: fix possible memory leak when enable bigalloc feature

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <6372216E.4010604@huawei.com>
Date:   Mon, 14 Nov 2022 19:07:26 +0800
From:   "yebin (H)" <yebin10@...wei.com>
To:     Eric Whitney <enwlinux@...il.com>
CC:     Jan Kara <jack@...e.cz>, Ye Bin <yebin@...weicloud.com>,
        <tytso@....edu>, <adilger.kernel@...ger.ca>,
        <linux-ext4@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <syzbot+05a0f0ccab4a25626e38@...kaller.appspotmail.com>
Subject: Re: [PATCH] ext4: fix possible memory leak when enable bigalloc
 feature



On 2022/11/8 9:28, Eric Whitney wrote:
> * yebin (H) <yebin10@...wei.com>:
>>
>> On 2022/11/7 21:46, Jan Kara wrote:
>>> Let me CC Eric who wrote this code...
>>>
>>> On Mon 07-11-22 09:54:15, Ye Bin wrote:
>>>> From: Ye Bin <yebin10@...wei.com>
>>>>
>>>> Syzbot found the following issue:
>>>> BUG: memory leak
>>>> unreferenced object 0xffff8881bde17420 (size 32):
>>>>     comm "rep", pid 2327, jiffies 4295381963 (age 32.265s)
>>>>     hex dump (first 32 bytes):
>>>>       01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>>>       00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>>>>     backtrace:
>>>>       [<00000000ac6d38f8>] __insert_pending+0x13c/0x2d0
>>>>       [<00000000d717de3b>] ext4_es_insert_delayed_block+0x399/0x4e0
>>>>       [<000000004be03913>] ext4_da_map_blocks.constprop.0+0x739/0xfa0
>>>>       [<00000000885a832a>] ext4_da_get_block_prep+0x10c/0x440
>>>>       [<0000000029b7f8ef>] __block_write_begin_int+0x28d/0x860
>>>>       [<00000000e182ebc3>] ext4_da_write_inline_data_begin+0x2d1/0xf30
>>>>       [<00000000ced0c8a2>] ext4_da_write_begin+0x612/0x860
>>>>       [<000000008d5f27fa>] generic_perform_write+0x215/0x4d0
>>>>       [<00000000552c1cde>] ext4_buffered_write_iter+0x101/0x3b0
>>>>       [<0000000052177ae8>] do_iter_readv_writev+0x19f/0x340
>>>>       [<000000004b9de834>] do_iter_write+0x13b/0x650
>>>>       [<00000000e2401b9b>] iter_file_splice_write+0x5a5/0xab0
>>>>       [<0000000023aa5d90>] direct_splice_actor+0x103/0x1e0
>>>>       [<0000000089e00fc1>] splice_direct_to_actor+0x2c9/0x7b0
>>>>       [<000000004386851e>] do_splice_direct+0x159/0x280
>>>>       [<00000000b567e609>] do_sendfile+0x932/0x1200
>>>>
>>>> Now, 'ext4_clear_inode' don't cleanup pending tree which will lead to memory
>>>> leak.
>>>> To solve above issue, cleanup pending tree when clear inode.
>>>>
>>>> Reported-by: syzbot+05a0f0ccab4a25626e38@...kaller.appspotmail.com
>>>> Signed-off-by: Ye Bin <yebin10@...wei.com>
>>> So I'd think that by the time we are freeing inode all pending reservations
>>> should be resolved and thus the tree should be empty. In that case you'd be
>>> just masking some other bug where we failed to cleanup pending information
>>> at the right moment. But maybe I'm missing something - that's why I've
>>> added Eric to have a look ;)
>>>
>>> 								Honza
>> Yes, this is really a circumvention plan. Maybe we can check here. If the
>> pending tree is
>> not empty, we still need to clean up resources to prevent memory leaks.
>> Let me analyze this process again.
> Jan is right.  If there are pending reservations remaining by the time we
> get to ext4_clear_inode(), something's broken somewhere else.  The code is
> designed to clean up any and all pending reservations when a file is truncated,
> and that should happen in ext4_evict_inode() before ext4_clear_inode() is
> called.  (It's probably unnecessary as a result, but the call to
> ext4_es_remove_extent() in ext4_clear_inode() should free any stray pending
> reservations via __es_remove_extent() and get_rsvd() unless they're somehow not
> consistent with the extents in the status tree.)
>
> If there are leaking pending reservations, it may be that the cluster
> accounting isn't working correctly.  So, the better thing to do is to find
> the root cause of the leak and fix it at its source.
>
> I can guess what the general cause of the breakage might be.  The presence of
> ext4_da_write_inline_data_begin() on the stack suggests that the inline_data
> option is being used with bigalloc here.  If so, that combination is unlikely
> to work well.  To my knowledge, the cluster accounting code has not yet been
> deliberately integrated with or well tested with inline.
>
> Eric
>
I find this issue fixed by my previous patch 
1b8f787ef547230a3249bcf897221ef0cc78481b
“ext4: fix warning in 'ext4_da_release_space'“ .  So above issue occured 
by migrate.
>>>> ---
>>>>    fs/ext4/extents_status.c | 22 ++++++++++++++++++++++
>>>>    fs/ext4/extents_status.h |  1 +
>>>>    fs/ext4/super.c          |  1 +
>>>>    3 files changed, 24 insertions(+)
>>>>
>>>> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
>>>> index cd0a861853e3..5f6b218464de 100644
>>>> --- a/fs/ext4/extents_status.c
>>>> +++ b/fs/ext4/extents_status.c
>>>> @@ -1947,6 +1947,28 @@ void ext4_remove_pending(struct inode *inode, ext4_lblk_t lblk)
>>>>    	write_unlock(&ei->i_es_lock);
>>>>    }
>>>> +void ext4_clear_inode_pending(struct inode *inode)
>>>> +{
>>>> +	struct ext4_inode_info *ei = EXT4_I(inode);
>>>> +	struct pending_reservation *pr;
>>>> +	struct ext4_pending_tree *tree;
>>>> +	struct rb_node *node;
>>>> +
>>>> +	if (EXT4_SB(inode->i_sb)->s_cluster_ratio == 1)
>>>> +		return;
>>>> +
>>>> +	write_lock(&ei->i_es_lock);
>>>> +	tree = &EXT4_I(inode)->i_pending_tree;
>>>> +	node = rb_first(&tree->root);
>>>> +	while (node) {
>>>> +		pr = rb_entry(node, struct pending_reservation, rb_node);
>>>> +		node = rb_next(node);
>>>> +		rb_erase(&pr->rb_node, &tree->root);
>>>> +		kmem_cache_free(ext4_pending_cachep, pr);
>>>> +	}
>>>> +	write_unlock(&ei->i_es_lock);
>>>> +}
>>>> +
>>>>    /*
>>>>     * ext4_is_pending - determine whether a cluster has a pending reservation
>>>>     *                   on it
>>>> diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
>>>> index 4ec30a798260..25b605309c06 100644
>>>> --- a/fs/ext4/extents_status.h
>>>> +++ b/fs/ext4/extents_status.h
>>>> @@ -248,6 +248,7 @@ extern int __init ext4_init_pending(void);
>>>>    extern void ext4_exit_pending(void);
>>>>    extern void ext4_init_pending_tree(struct ext4_pending_tree *tree);
>>>>    extern void ext4_remove_pending(struct inode *inode, ext4_lblk_t lblk);
>>>> +extern void ext4_clear_inode_pending(struct inode *inode);
>>>>    extern bool ext4_is_pending(struct inode *inode, ext4_lblk_t lblk);
>>>>    extern int ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
>>>>    					bool allocated);
>>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
>>>> index 106fb06e24e8..160667dcf09a 100644
>>>> --- a/fs/ext4/super.c
>>>> +++ b/fs/ext4/super.c
>>>> @@ -1434,6 +1434,7 @@ void ext4_clear_inode(struct inode *inode)
>>>>    	clear_inode(inode);
>>>>    	ext4_discard_preallocations(inode, 0);
>>>>    	ext4_es_remove_extent(inode, 0, EXT_MAX_BLOCKS);
>>>> +	ext4_clear_inode_pending(inode);
>>>>    	dquot_drop(inode);
>>>>    	if (EXT4_I(inode)->jinode) {
>>>>    		jbd2_journal_release_jbd_inode(EXT4_JOURNAL(inode),
>>>> -- 
>>>> 2.31.1
>>>>
> .
>