linux-kernel - [PATCH 1/2] ext4: only defer sb update on error if SB


Message-ID: <da8af2e5170f0d94031b812d7d50c6ec1967db1b.1740212945.git.ojaswin@linux.ibm.com>
Date: Sat, 22 Feb 2025 14:10:22 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: linux-ext4@...r.kernel.org, "Theodore Ts'o" <tytso@....edu>
Cc: Jan Kara <jack@...e.cz>, linux-kernel@...r.kernel.org,
        Mahesh Kumar <maheshkumar657g@...il.com>,
        Ritesh Harjani <ritesh.list@...il.com>
Subject: [PATCH 1/2] ext4: only defer sb update on error if SB_ACTIVE

Presently we always BUG_ON if trying to start a transaction on a journal
marked with JBD2_UNMOUNT, since this should never happen. However, while
running stress tests it was observed that in some error handling paths
it is possible for update_super_work to start a transaction after the
journal is destroyed, e.g.:

(umount)
ext4_kill_sb
  kill_block_super
    generic_shutdown_super
      sync_filesystem /* commits all txns */
      evict_inodes
        /* might start a new txn */
      ext4_put_super
        flush_work(&sbi->s_sb_upd_work) /* flush the workqueue */
        jbd2_journal_destroy
          journal_kill_thread
            journal->j_flags |= JBD2_UNMOUNT;
          jbd2_journal_commit_transaction
            jbd2_journal_get_descriptor_buffer
              jbd2_journal_bmap
                ext4_journal_bmap
                  ext4_map_blocks
                    ...
                    ext4_inode_error
                      ext4_handle_error
                        schedule_work(&sbi->s_sb_upd_work)

                                               /* work queue kicks in */
                                               update_super_work
                                                 jbd2_journal_start
                                                   start_this_handle
                                                     BUG_ON(journal->j_flags &
                                                            JBD2_UNMOUNT)

Hence, make sure we only defer the update of the ext4 sb if the sb is
still active.  Otherwise, just fall back to an un-journaled commit.
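
For illustration only, the decision described above can be modelled in
userspace C. This is a sketch, not kernel code: struct model_sb,
MODEL_SB_ACTIVE, enum sb_update_path and choose_sb_update_path() are
hypothetical names, and the flag value is arbitrary. The two outcomes
stand in for schedule_work(&sbi->s_sb_upd_work) and
ext4_commit_super(sb) respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SB_ACTIVE; the value is arbitrary. */
#define MODEL_SB_ACTIVE 0x1UL

/* Which path the error handler takes to persist the error info. */
enum sb_update_path {
	SB_UPDATE_DEFERRED,	/* models schedule_work(&sbi->s_sb_upd_work) */
	SB_UPDATE_DIRECT,	/* models ext4_commit_super(sb) */
};

/* Hypothetical, heavily reduced model of a superblock. */
struct model_sb {
	unsigned long s_flags;	/* only MODEL_SB_ACTIVE is used here */
	bool has_journal;	/* models a non-NULL journal pointer */
};

/*
 * Defer the superblock update only if the filesystem keeps running,
 * a journal exists, and the superblock is still active. In every
 * other case fall back to the direct, un-journaled commit.
 */
static enum sb_update_path
choose_sb_update_path(const struct model_sb *sb, bool continue_fs)
{
	assert(sb != NULL);

	if (continue_fs && sb->has_journal &&
	    (sb->s_flags & MODEL_SB_ACTIVE))
		return SB_UPDATE_DEFERRED;

	return SB_UPDATE_DIRECT;
}
```

In this model an error raised while the superblock is no longer active
always takes the direct path, so no work item is left behind to start
a handle later.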

The important thing to note here is that we must only defer the sb
update if we have not yet flushed s_sb_upd_work in the umount path;
otherwise this race can be hit (point 1 below). Since we don't have a
direct way to check for that, we use SB_ACTIVE instead. The SB_ACTIVE
check is a bit subtle, so here are some notes for future reference:

1. Ideally we would want to have something like
((flags & JBD2_UNMOUNT) == 0); however, this is not correct since we
could end up scheduling work after it has been flushed:

 ext4_put_super
  flush_work(&sbi->s_sb_upd_work)

                           **kjournald2**
                           jbd2_journal_commit_transaction
                           ...
                           ext4_inode_error
                             /* JBD2_UNMOUNT not set */
                             schedule_work(s_sb_upd_work)

   jbd2_journal_destroy
    journal->j_flags |= JBD2_UNMOUNT;

                                      **workqueue**
                                      update_super_work
                                       jbd2_journal_start
                                        start_this_handle
                                          BUG_ON(JBD2_UNMOUNT)

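Something like the above doesn't happen with the SB_ACTIVE check because
SB_ACTIVE is cleared in generic_shutdown_super before ext4_put_super
runs, so if it is still set we are sure that the workqueue will be
flushed at a later point if we are in the umount path.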
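
The interleaving in point 1 can be replayed as a single-threaded
userspace model to compare the two candidate checks. This is only a
sketch under stated assumptions: all model_* names and MODEL_* flag
values are hypothetical, the steps run in one fixed order rather than
concurrently, and the model assumes SB_ACTIVE has already been cleared
by the time ext4_put_super flushes the work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the real kernel values differ. */
#define MODEL_SB_ACTIVE    0x1u
#define MODEL_JBD2_UNMOUNT 0x2u

/* Which condition the error handler uses to decide on deferring. */
enum model_guard { GUARD_JBD2_UNMOUNT, GUARD_SB_ACTIVE };

struct model_state {
	unsigned int sb_flags;	/* models sb->s_flags */
	unsigned int j_flags;	/* models journal->j_flags */
	bool work_pending;	/* models s_sb_upd_work being queued */
	bool bug_hit;		/* models BUG_ON() in start_this_handle */
};

/* Models ext4_handle_error: defer if the guard allows it. */
static void model_handle_error(struct model_state *s, enum model_guard g)
{
	bool defer;

	assert(s != NULL);
	if (g == GUARD_JBD2_UNMOUNT)
		defer = (s->j_flags & MODEL_JBD2_UNMOUNT) == 0;
	else
		defer = (s->sb_flags & MODEL_SB_ACTIVE) != 0;

	if (defer)
		s->work_pending = true;
	/* else: direct, un-journaled commit; nothing is queued */
}

/* Models update_super_work running (or flush_work draining it). */
static void model_run_work(struct model_state *s)
{
	assert(s != NULL);
	if (!s->work_pending)
		return;
	s->work_pending = false;
	if (s->j_flags & MODEL_JBD2_UNMOUNT)
		s->bug_hit = true;	/* transaction started after destroy */
}

/* Replays the umount sequence; returns true if the BUG_ON would fire. */
static bool model_umount_race(enum model_guard g)
{
	struct model_state s = { .sb_flags = MODEL_SB_ACTIVE };

	s.sb_flags &= ~MODEL_SB_ACTIVE;	 /* assumed: cleared before put_super */
	model_run_work(&s);		 /* flush_work: nothing pending yet */
	model_handle_error(&s, g);	 /* kjournald2 hits an error */
	s.j_flags |= MODEL_JBD2_UNMOUNT; /* jbd2_journal_destroy */
	model_run_work(&s);		 /* workqueue kicks in */

	return s.bug_hit;
}
```

With GUARD_JBD2_UNMOUNT the model queues work after the flush and trips
the modelled BUG_ON; with GUARD_SB_ACTIVE the error is committed
directly and no work is left pending.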

2. We don't need a similar check in ext4_grp_locked_error since it is
only called from mballoc, and AFAICT it would always be valid to
schedule work here.

Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available")
Reported-by: Mahesh Kumar <maheshkumar657g@...il.com>
Suggested-by: Ritesh Harjani <ritesh.list@...il.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
---
 fs/ext4/super.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index a963ffda692a..b7341e9acf62 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -706,7 +706,7 @@ static void ext4_handle_error(struct super_block *sb, bool force_ro, int error,
 		 * constraints, it may not be safe to do it right here so we
 		 * defer superblock flushing to a workqueue.
 		 */
-		if (continue_fs && journal)
+		if (continue_fs && journal && (sb->s_flags & SB_ACTIVE))
 			schedule_work(&EXT4_SB(sb)->s_sb_upd_work);
 		else
 			ext4_commit_super(sb);
-- 
2.48.1