Message-ID: <jdssgnr44c6scnzhpbl7gwgcpo2f25n3cxaaw6fo2uzh3bdwda@ograleyyoyot>
Date: Mon, 5 Jan 2026 17:17:31 +0100
From: Jan Kara <jack@...e.cz>
To: Li Chen <me@...ux.beauty>
Cc: Theodore Ts'o <tytso@....edu>,
Andreas Dilger <adilger.kernel@...ger.ca>, Jan Kara <jack@...e.cz>,
Harshad Shirwadkar <harshadshirwadkar@...il.com>, linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ext4: fast commit: avoid fs_reclaim inversion in perform_commit

On Tue 23-12-25 21:13:42, Li Chen wrote:
> lockdep reports a possible deadlock due to lock order inversion:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(fs_reclaim);
>                                lock(&sbi->s_fc_lock);
>                                lock(fs_reclaim);
>   lock(&sbi->s_fc_lock);
>
> ext4_fc_perform_commit() holds s_fc_lock while writing the fast commit
> log. Allocations here can enter reclaim and take fs_reclaim, inverting
> with ext4_fc_del() which runs under fs_reclaim during inode eviction.
> Wrap Step 6 in memalloc_nofs_save()/restore() so reclaim is skipped
> while s_fc_lock is held.
>
> Fixes: 6593714d67ba ("ext4: hold s_fc_lock while during fast commit")
> Signed-off-by: Li Chen <me@...ux.beauty>

Thanks for the analysis and the patch! Your solution is correct in
principle, but it is a bit fragile because there can be other places (or we
can grow them in the future) where sbi->s_fc_lock is held while doing an
allocation. sbi->s_fc_lock can be acquired from the inode eviction path
(ext4_clear_inode()), so this lock is inherently reclaim unsafe. What we do
in such cases is create helper functions that acquire / release the lock and
also set the proper allocation context, and then use these helpers - like in
commit 00d873c17e29 ("ext4: avoid deadlock in fs reclaim with page
writeback"). Can you perhaps modify your patch to follow that approach as
well?
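
To make the helper idea concrete, here is a rough userspace sketch of the
pattern, not kernel code and not a proposed patch. All model_* names are
hypothetical stand-ins: the thread-local flag models the per-task NOFS
state that memalloc_nofs_save() / memalloc_nofs_restore() manage in the
kernel, and the pthread mutex models sbi->s_fc_lock. The point is only
that the lock and the allocation context are always taken and dropped
together, so no caller can hold the lock without the NOFS scope.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in (assumption) for the per-task NOFS allocation-context flag. */
static _Thread_local bool task_nofs;

/* Model of memalloc_nofs_save(): enter NOFS scope, return previous state. */
static unsigned int model_nofs_save(void)
{
	unsigned int old = task_nofs;

	task_nofs = true;
	return old;
}

/* Model of memalloc_nofs_restore(): put back the previously saved state. */
static void model_nofs_restore(unsigned int old)
{
	task_nofs = (old != 0);
}

/* Model of the relevant part of the ext4 superblock info. */
struct model_sbi {
	pthread_mutex_t s_fc_lock;
};

/* Hypothetical helper: take the lock, then enter NOFS scope. */
static unsigned int model_fc_lock(struct model_sbi *sbi)
{
	pthread_mutex_lock(&sbi->s_fc_lock);
	return model_nofs_save();
}

/* Hypothetical helper: leave NOFS scope, then drop the lock. */
static void model_fc_unlock(struct model_sbi *sbi, unsigned int ctx)
{
	model_nofs_restore(ctx);
	pthread_mutex_unlock(&sbi->s_fc_lock);
}

/* True if an allocation made right now could recurse into fs reclaim. */
static bool model_may_enter_fs_reclaim(void)
{
	return !task_nofs;
}
```

A caller would write ctx = model_fc_lock(sbi); ... model_fc_unlock(sbi, ctx);
and any allocation between the two calls sees the NOFS scope, which is
what keeps reclaim from re-entering the filesystem under the lock.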

								Honza

> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> index 3bcdd4619de1..b0c458082997 100644
> --- a/fs/ext4/fast_commit.c
> +++ b/fs/ext4/fast_commit.c
> @@ -1045,6 +1045,7 @@ static int ext4_fc_perform_commit(journal_t *journal)
> struct ext4_fc_head head;
> struct inode *inode;
> struct blk_plug plug;
> + unsigned int nofs;
> int ret = 0;
> u32 crc = 0;
>
> @@ -1118,6 +1119,7 @@ static int ext4_fc_perform_commit(journal_t *journal)
> blkdev_issue_flush(journal->j_fs_dev);
>
> blk_start_plug(&plug);
> + nofs = memalloc_nofs_save();
> /* Step 6: Write fast commit blocks to disk. */
> if (sbi->s_fc_bytes == 0) {
> /*
> @@ -1158,6 +1160,7 @@ static int ext4_fc_perform_commit(journal_t *journal)
>
> out:
> mutex_unlock(&sbi->s_fc_lock);
> + memalloc_nofs_restore(nofs);
> blk_finish_plug(&plug);
> return ret;
> }
> --
> 2.52.0
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR