Message-ID: <3f9a67e2-ef08-47d4-b35e-41841e24bb71@huawei.com>
Date: Tue, 25 Feb 2025 09:53:10 +0800
From: Baokun Li <libaokun1@...wei.com>
To: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
CC: <linux-ext4@...r.kernel.org>, Theodore Ts'o <tytso@....edu>, Jan Kara
	<jack@...e.cz>, <linux-kernel@...r.kernel.org>, Mahesh Kumar
	<maheshkumar657g@...il.com>, Ritesh Harjani <ritesh.list@...il.com>, Yang
 Erkun <yangerkun@...wei.com>
Subject: Re: [PATCH 1/2] ext4: only defer sb update on error if SB_ACTIVE

On 2025/2/22 16:40, Ojaswin Mujoo wrote:
> Presently we always BUG_ON if trying to start a transaction on a journal
> marked with JBD2_UNMOUNT, since this should never happen. However while
> running stress tests it was observed that in case of some error handling
> paths, it is possible for update_super_work to start a transaction after
> the journal is destroyed, e.g.:
>
> (umount)
> ext4_kill_sb
>    kill_block_super
>      generic_shutdown_super
>        sync_filesystem /* commits all txns */
>        evict_inodes
>          /* might start a new txn */
>        ext4_put_super
>          flush_work(&sbi->s_sb_upd_work) /* flush the workqueue */
>          jbd2_journal_destroy
>            journal_kill_thread
>              journal->j_flags |= JBD2_UNMOUNT;
>            jbd2_journal_commit_transaction
>              jbd2_journal_get_descriptor_buffer
>                jbd2_journal_bmap
>                  ext4_journal_bmap
>                    ext4_map_blocks
>                      ...
>                      ext4_inode_error

Just curious: since jbd2_journal_bmap() only looks up the mapping and does
not create one, how does it fail here? Is there more information in dmesg?
Does s_journal_inum look normal after the file system corruption?

Thanks,
Baokun
>                        ext4_handle_error
>                          schedule_work(&sbi->s_sb_upd_work)
>
>                                                 /* work queue kicks in */
>                                                 update_super_work
>                                                   jbd2_journal_start
>                                                     start_this_handle
>                                                       BUG_ON(journal->j_flags &
>                                                              JBD2_UNMOUNT)
>
> Hence, make sure we only defer the update of the ext4 sb if the sb is
> still active. Otherwise, just fall back to an un-journaled commit.
>
> The important thing to note here is that we must only defer the sb update
> if we have not yet flushed the s_sb_upd_work queue in the umount path;
> otherwise this race can be hit (point 1 below). Since we don't have a
> direct way to check for that, we use SB_ACTIVE instead. The SB_ACTIVE
> check is a bit subtle, so here are some notes for future reference:
>
> 1. Ideally we would want to have something like
> ((flags & JBD2_UNMOUNT) == 0); however, this is not correct since we could
> end up scheduling work after it has been flushed:
>
>   ext4_put_super
>    flush_work(&sbi->s_sb_upd_work)
>
>                             **kjournald2**
>                             jbd2_journal_commit_transaction
>                             ...
>                             ext4_inode_error
>                               /* JBD2_UNMOUNT not set */
>                               schedule_work(s_sb_upd_work)
>
>     jbd2_journal_destroy
>      journal->j_flags |= JBD2_UNMOUNT;
>
>                                        **workqueue**
>                                        update_super_work
>                                         jbd2_journal_start
>                                          start_this_handle
>                                            BUG_ON(JBD2_UNMOUNT)
>
> Something like the above doesn't happen with the SB_ACTIVE check because
> we are sure that the workqueue will be flushed at a later point if we are
> in the umount path.
>
> 2. We don't need a similar check in ext4_grp_locked_error since it is
> only called from mballoc and AFAICT it would always be valid to schedule
> work here.
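
The interleaving in point 1 can be replayed in a small user-space model. This is only a sketch, not kernel code: `struct model`, `enum gate`, `report_error()` and `late_error_hits_bug()` are hypothetical names made up for illustration, and the model assumes SB_ACTIVE has already been cleared by the time ext4_put_super() flushes the work item, which is the ordering the notes above rely on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical gating policies for "may we defer the sb update?" */
enum gate { GATE_JBD2_UNMOUNT, GATE_SB_ACTIVE };

/* Toy model of the state touched on the umount path; not kernel structs. */
struct model {
	bool sb_active;    /* stands in for sb->s_flags & SB_ACTIVE */
	bool jbd2_unmount; /* stands in for journal->j_flags & JBD2_UNMOUNT */
	bool work_pending; /* stands in for a queued s_sb_upd_work */
};

/* An error handler decides whether to queue the deferred update. */
static void report_error(struct model *m, enum gate g)
{
	bool may_defer = (g == GATE_SB_ACTIVE) ? m->sb_active
					       : !m->jbd2_unmount;

	if (may_defer)
		m->work_pending = true; /* schedule_work() */
	/* else: the un-journaled commit happens here, nothing is queued */
}

/*
 * Replays the interleaving from point 1: the error arrives after
 * flush_work() but before JBD2_UNMOUNT is set. Returns true if the
 * worker would then start a handle on an unmounting journal, i.e.
 * the BUG_ON in start_this_handle would fire.
 */
static bool late_error_hits_bug(enum gate g)
{
	struct model m = { .sb_active = true };

	assert(g == GATE_JBD2_UNMOUNT || g == GATE_SB_ACTIVE);

	m.sb_active = false;    /* assumed: cleared before ext4_put_super */
	m.work_pending = false; /* flush_work(): queued work has run */
	report_error(&m, g);    /* kjournald2 hits ext4_inode_error */
	m.jbd2_unmount = true;  /* jbd2_journal_destroy */

	return m.work_pending && m.jbd2_unmount; /* worker runs now */
}
```

In this model, `late_error_hits_bug(GATE_JBD2_UNMOUNT)` reports the BUG_ON, while `late_error_hits_bug(GATE_SB_ACTIVE)` does not, because the late error falls back to the direct commit instead of queueing work.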
>
> Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available")
> Reported-by: Mahesh Kumar <maheshkumar657g@...il.com>
> Suggested-by: Ritesh Harjani <ritesh.list@...il.com>
> Signed-off-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
> ---
>   fs/ext4/super.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index a963ffda692a..b7341e9acf62 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -706,7 +706,7 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
>   		 * constraints, it may not be safe to do it right here so we
>   		 * defer superblock flushing to a workqueue.
>   		 */
> -		if (continue_fs && journal)
> +		if (continue_fs && journal && (sb->s_flags & SB_ACTIVE))
>   			schedule_work(&EXT4_SB(sb)->s_sb_upd_work);
>   		else
>   			ext4_commit_super(sb);
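
The patched condition can be read as a pure decision function. The sketch below is a user-space illustration only: `pick_sb_update()`, `enum sb_update` and `SKETCH_SB_ACTIVE` are hypothetical stand-ins (the flag value is a placeholder, not taken from the patch), and the two return values correspond to the schedule_work() and ext4_commit_super() branches in the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flag bit for this sketch; the kernel defines the real SB_ACTIVE. */
#define SKETCH_SB_ACTIVE (1UL << 30)

enum sb_update {
	SB_UPDATE_DEFERRED, /* schedule_work(&EXT4_SB(sb)->s_sb_upd_work) */
	SB_UPDATE_DIRECT    /* ext4_commit_super(sb) */
};

/* Mirrors the patched if/else: defer only while the sb is still active. */
static enum sb_update pick_sb_update(bool continue_fs, bool has_journal,
				     unsigned long s_flags)
{
	if (continue_fs && has_journal && (s_flags & SKETCH_SB_ACTIVE))
		return SB_UPDATE_DEFERRED;
	return SB_UPDATE_DIRECT;
}

/* Walks the cases that matter for the umount race. */
static void pick_sb_update_selfcheck(void)
{
	/* Normal operation: journal present, sb active, so defer. */
	assert(pick_sb_update(true, true, SKETCH_SB_ACTIVE) == SB_UPDATE_DEFERRED);
	/* During umount: sb no longer active, so commit directly. */
	assert(pick_sb_update(true, true, 0) == SB_UPDATE_DIRECT);
	/* No journal, or not continuing: unchanged by the patch. */
	assert(pick_sb_update(true, false, SKETCH_SB_ACTIVE) == SB_UPDATE_DIRECT);
	assert(pick_sb_update(false, true, SKETCH_SB_ACTIVE) == SB_UPDATE_DIRECT);
}
```

Only the second case changes behaviour relative to the old `continue_fs && journal` test; the other three rows give the same answer before and after the patch.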