[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <74139de7-4194-46b9-aeea-702bdd5b432a@huawei.com>
Date: Wed, 26 Mar 2025 17:26:19 +0800
From: Baokun Li <libaokun1@...wei.com>
To: Ojaswin Mujoo <ojaswin@...ux.ibm.com>, Zhang Yi <yi.zhang@...weicloud.com>
CC: <linux-ext4@...r.kernel.org>, Jan Kara <jack@...e.cz>, Ritesh Harjani
<ritesh.list@...il.com>, <linux-kernel@...r.kernel.org>, Yang Erkun
<yangerkun@...wei.com>, Theodore Ts'o <tytso@....edu>
Subject: Re: [PATCH] ext4: cache es->s_journal_inum in ext4_sb_info
On 2025/3/26 16:33, Zhang Yi wrote:
> On 2025/3/26 14:39, Ojaswin Mujoo wrote:
>> On Wed, Mar 26, 2025 at 12:01:45PM +0800, Zhang Yi wrote:
>>> On 2025/3/26 10:16, Baokun Li wrote:
>>>> On 2025/3/26 1:57, Ojaswin Mujoo wrote:
>>>>> On Tue, Mar 18, 2025 at 10:31:29PM -0400, Theodore Ts'o wrote:
>>>>>> On Tue, Mar 18, 2025 at 01:42:31PM +0530, Ojaswin Mujoo wrote:
>>>>>>>> So this is something we need to do if the journal is actived, and if
>>>>>>>> it's active, then sbi->s_journal will be non-NULL, and so we can just
>>>>>>>> check to see if inode == sbi->s_journal instead. This will simplify
>>>>>>> I believe you mean inode == sbi->s_journal->j_inode here right?
>>>>>> Yes, that's what I meant; sorry for the not catching this before I
>>>>>> sent my reply.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> - Ted
>>>>> Hi Ted, Baokun,
>>>>>
>>>>> I got some time to revisit this. Seems like checking against
>>>>> s_journal->j_inode is not enough. This is because both
>>>>> ext4_check_blockref() and check_block_validity() can be called even
>>>>> before journal->j_inode is set:
>>>>>
>>>>> ext4_open_inode_journal
>>>>> ext4_get_journal_inode
>>>>> __ext4_iget
>>>>> ext4_ind_check_inode
>>>>> ext4_check_blockref /* j_inode not set */
>>>>>
>>>>> journal = jbd2_journal_init_inode
>>>>> bmap
>>>>> ext4_bmap
>>>>> iomap_bmap
>>>>> ext4_iomap_begin
>>>>> ext4_map_blocks
>>>>> check_block_validity
>>>>>
>>>>> journal->j_inode = inode
>>>>>
>>>>>
>>>>> Now, I think in this case the best solution might be to use the extra
>>>>> field like we do in this patch but set EXT4_SB(sb)->s_journal_ino
>>>>> sufficiently early.
>>>>>
>>>>> Thoughts?
>>>> Because system zone setup happens after the journal are loaded, I think we
>>>> can skip the check if the journal haven't been loaded yet, like this:
>>>>
>>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>>> index d04d8a7f12e7..38dc72ff7e78 100644
>>>> --- a/fs/ext4/inode.c
>>>> +++ b/fs/ext4/inode.c
>>>> @@ -383,9 +383,10 @@ static int __check_block_validity(struct inode *inode, const char *func,
>>>> unsigned int line,
>>>> struct ext4_map_blocks *map)
>>>> {
>>>> + journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
>>>> +
>>>> if (ext4_has_feature_journal(inode->i_sb) &&
>>>> - (inode->i_ino ==
>>>> - le32_to_cpu(EXT4_SB(inode->i_sb)->s_es->s_journal_inum)))
>>>> + (!journal || inode == journal->j_inode))
>>>> return 0;
>>>> if (!ext4_inode_block_valid(inode, map->m_pblk, map->m_len)) {
>>>> ext4_error_inode(inode, func, line, map->m_pblk,
>>>>
>>>> If any part of the journal area overlaps with the system zone, we'll catch
>>>> it when we add the journal area to the system zone later.
>>>>
>>>>
>>> Since the creation of the system zone relies on the journal being
>>> loaded, I think there is no risk in proceeding to call
>>> ext4_inode_block_valid() to perform a basic block range check for
>>> the journal inode, or even better.
Indeed, performing some basic anomaly checks in advance can prevent
journal replay from worsening the situation in abnormal cases. Moreover,
since s_journal is NULL at this point, we won't schedule s_sb_upd_work
even if the check fails, which is safe.
>>>
>>> Thanks,
>>> Yi.
>> Got it Yi, makes sense to me. So I believe you are suggesting something
>> like:
>>
>> @@ -384,9 +384,10 @@ static int __check_block_validity(struct inode *inode, const char *func,
>> unsigned int line,
>> struct ext4_map_blocks *map)
>> {
>> + journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
>> +
>> if (ext4_has_feature_journal(inode->i_sb) &&
> We are going to check ->s_journal, so I suppose we could drop this
> feature check as well. Others looks good to me.
Seconded.
>
>> - (inode->i_ino ==
>> - le32_to_cpu(EXT4_SB(inode->i_sb)->s_es->s_journal_inum)))
>> + (journal && journal->j_inode == inode))
>> return 0;
>> if (!ext4_inode_block_valid(inode, map->m_pblk, map->m_len)) {
>> ext4_error_inode(inode, func, line, map->m_pblk,
>>
>> So that even if it is a journal inode we can go ahead and perform some basic checks
>> as the system zone rbtree will anyways be NULL at this point. From a cursory look,
>> it seems that __ext4_iget(..., journal_inode) -> ext4_ext_check_inode() already relies
>> on the fact that system zone is NULL, so we should be okay here as well.
> Yeah, that's right. :)
>
> Cheers,
> Yi.
>
>> If this looks good, I'll send a v2 with the suggested changes.
>>
>> Thanks,
>> ojaswin
Please mention in the commit message that we're now doing some basic
checks on the journal area.
Cheers,
Baokun
Powered by blists - more mailing lists