Message-ID: <bfd1b9f3-7f0e-4b3c-9399-4d697be37a9e@huaweicloud.com>
Date: Sat, 17 Jun 2023 10:42:59 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org, tytso@....edu,
adilger.kernel@...ger.ca, yi.zhang@...wei.com,
chengzhihao1@...wei.com, yukuai3@...wei.com
Subject: Re: [PATCH v2] jbd2: skip reading super block if it has been verified
On 2023/6/16 21:27, Jan Kara wrote:
> On Fri 16-06-23 09:55:47, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@...wei.com>
>>
>> We hit the NULL pointer dereference below while running the generic/475
>> I/O failure stress test.
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000000
>> #PF: supervisor write access in kernel mode
>> #PF: error_code(0x0002) - not-present page
>> PGD 0 P4D 0
>> Oops: 0002 [#1] PREEMPT SMP PTI
>> CPU: 1 PID: 15600 Comm: fsstress Not tainted 6.4.0-rc5-xfstests-00055-gd3ab1bca26b4 #190
>> RIP: 0010:jbd2_journal_set_features+0x13d/0x430
>> ...
>> Call Trace:
>> <TASK>
>> ? __die+0x23/0x60
>> ? page_fault_oops+0xa4/0x170
>> ? exc_page_fault+0x67/0x170
>> ? asm_exc_page_fault+0x26/0x30
>> ? jbd2_journal_set_features+0x13d/0x430
>> jbd2_journal_revoke+0x47/0x1e0
>> __ext4_forget+0xc3/0x1b0
>> ext4_free_blocks+0x214/0x2f0
>> ext4_free_branches+0xeb/0x270
>> ext4_ind_truncate+0x2bf/0x320
>> ext4_truncate+0x1e4/0x490
>> ext4_handle_inode_extension+0x1bd/0x2a0
>> ? iomap_dio_complete+0xaf/0x1d0
>>
>> The root cause is that the journal super block failed to be written out
>> due to I/O fault injection, so its uptodate bit was cleared by
>> end_buffer_write_sync() and had not yet been set again in
>> jbd2_write_superblock(). This raced with
>> journal_get_superblock()->bh_read(); unfortunately, the read I/O also
>> failed, so the error handling in journal_fail_superblock() unexpectedly
>> cleared journal->j_sb_buffer, finally leading to the above NULL pointer
>> dereference.
>>
>> If the journal super block has already been read and verified, there is
>> no need to call bh_read() to read it again, even if a later write-out
>> failed. So the fix is simply to move the buffer_verified(bh) check in
>> front of bh_read(). Also remove a stale comment left in
>> jbd2_journal_check_used_features().
>>
>> Fixes: 51bacdba23d8 ("jbd2: factor out journal initialization from journal_get_superblock()")
>> Reported-by: Theodore Ts'o <tytso@....edu>
>> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
>
> This works as a workaround. It is a bit kludgy but for now I guess it is
> good enough. Thanks for the fix and feel free to add:
>
Thanks for the review. Yes, I suppose it would be better to find a way to
adjust the order of journal loading and feature checking in
ocfs2_check_volume(), so that we could completely remove the
journal_get_superblock() call from jbd2_journal_check_used_features().
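
For anyone following along, the ordering problem in the quoted patch can be
modelled in user space. This is only a rough sketch, not the jbd2 code: the
fake_bh and fake_journal structs and every function below are hypothetical
stand-ins, the return values of the read helper are simplified, and the real
superblock verification is elided.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head: only the two bits that matter. */
struct fake_bh {
	bool uptodate;	/* cleared when a write-out fails */
	bool verified;	/* set once the superblock has been checked */
};

/* Hypothetical stand-in for journal_t. */
struct fake_journal {
	struct fake_bh *sb_buffer;	/* models journal->j_sb_buffer */
	bool io_fails;			/* models the injected I/O fault */
};

/* Simplified model of bh_read(): no I/O if uptodate, otherwise a read that may fail. */
int fake_bh_read(struct fake_journal *j, struct fake_bh *bh)
{
	if (bh->uptodate)
		return 0;
	if (j->io_fails)
		return -EIO;
	bh->uptodate = true;
	return 0;
}

/* Models the effect described for journal_fail_superblock(): the buffer is dropped. */
void fake_fail_superblock(struct fake_journal *j)
{
	j->sb_buffer = NULL;
}

/* Ordering before the patch: read first, check the verified bit afterwards. */
int get_superblock_old(struct fake_journal *j)
{
	struct fake_bh *bh = j->sb_buffer;
	int err;

	assert(bh != NULL);
	err = fake_bh_read(j, bh);
	if (err < 0) {
		fake_fail_superblock(j);
		return err;
	}
	if (bh->verified)
		return 0;
	bh->verified = true;	/* real verification elided */
	return 0;
}

/* Ordering after the patch: a verified buffer is never re-read. */
int get_superblock_new(struct fake_journal *j)
{
	struct fake_bh *bh = j->sb_buffer;
	int err;

	assert(bh != NULL);
	if (bh->verified)
		return 0;
	err = fake_bh_read(j, bh);
	if (err < 0) {
		fake_fail_superblock(j);
		return err;
	}
	bh->verified = true;	/* real verification elided */
	return 0;
}
```

Starting from a buffer with verified set and uptodate cleared (the state left
by the failed write-out) and with io_fails set, get_superblock_old() returns
-EIO and leaves sb_buffer NULL, while get_superblock_new() returns 0 and
leaves sb_buffer untouched.
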
Thanks,
Yi.
>
>
>> ---
>> v1->v2:
>> - Remove a stale comment left in jbd2_journal_check_used_features().
>>
>> fs/jbd2/journal.c | 7 +++----
>> 1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
>> index b5e57735ab3f..559242df0f9a 100644
>> --- a/fs/jbd2/journal.c
>> +++ b/fs/jbd2/journal.c
>> @@ -1919,6 +1919,9 @@ static int journal_get_superblock(journal_t *journal)
>>  	bh = journal->j_sb_buffer;
>>
>>  	J_ASSERT(bh != NULL);
>> +	if (buffer_verified(bh))
>> +		return 0;
>> +
>>  	err = bh_read(bh, 0);
>>  	if (err < 0) {
>>  		printk(KERN_ERR
>> @@ -1926,9 +1929,6 @@ static int journal_get_superblock(journal_t *journal)
>>  		goto out;
>>  	}
>>
>> -	if (buffer_verified(bh))
>> -		return 0;
>> -
>>  	sb = journal->j_superblock;
>>
>>  	err = -EINVAL;
>> @@ -2229,7 +2229,6 @@ int jbd2_journal_check_used_features(journal_t *journal, unsigned long compat,
>>
>>  	if (!compat && !ro && !incompat)
>>  		return 1;
>> -	/* Load journal superblock if it is not loaded yet. */
>>  	if (journal_get_superblock(journal))
>>  		return 0;
>>  	if (!jbd2_format_support_feature(journal))
>> --
>> 2.39.2
>>