linux-kernel - Re: [RFC PATCH] fs/hfs: fix ABBA deadlock in hfs_mdb

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aWcHhTiUrDppotRg@ndev>
Date: Wed, 14 Jan 2026 11:03:33 +0800
From: Jinchao Wang <wangjinchao600@...il.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@....com>
Cc: "glaubitz@...sik.fu-berlin.de" <glaubitz@...sik.fu-berlin.de>,
	"frank.li@...o.com" <frank.li@...o.com>,
	"slava@...eyko.com" <slava@...eyko.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
	"syzbot+1e3ff4b07c16ca0f6fe2@...kaller.appspotmail.com" <syzbot+1e3ff4b07c16ca0f6fe2@...kaller.appspotmail.com>
Subject: Re: [RFC PATCH] fs/hfs: fix ABBA deadlock in hfs_mdb_commit

On Tue, Jan 13, 2026 at 08:52:45PM +0000, Viacheslav Dubeyko wrote:
> On Tue, 2026-01-13 at 16:19 +0800, Jinchao Wang wrote:
> > syzbot reported a hung task in hfs_mdb_commit where a deadlock occurs
> > between the MDB buffer lock and the folio lock.
> > 
> > The deadlock happens because hfs_mdb_commit() holds the mdb_bh
> > lock while calling sb_bread(), which attempts to acquire the lock
> > on the same folio.
> 
> I don't quite to follow to your logic. We have only one sb_bread() [1] in
> hfs_mdb_commit(). This read is trying to extract the volume bitmap. How is it
> possible that superblock and volume bitmap is located at the same folio? Are you
> sure? Which size of the folio do you imply here?
> 
> Also, it your logic is correct, then we never could be able to mount/unmount or
> run any operations on HFS volumes because of likewise deadlock. However, I can
> run xfstests on HFS volume.
> 
> [1] https://elixir.bootlin.com/linux/v6.19-rc5/source/fs/hfs/mdb.c#L324

Hi Viacheslav,

After reviewing your feedback, I realized that my previous RFC was not in
the correct format. It was not intended to be a final, merge-ready patch,
but rather a record of the analysis and trial fixes conducted so far.
I apologize for the confusion caused by my previous email.

The details are reorganized as follows:

- Observation
- Analysis
- Verification
- Conclusion

Observation
============

Syzbot report: https://syzkaller.appspot.com/bug?extid=1e3ff4b07c16ca0f6fe2

For this version:
| time             |  kernel    | Commit       | Syzkaller |
| 2025/12/20 17:03 | linux-next | cc3aa43b44bd | d6526ea3  |

Crash log: https://syzkaller.appspot.com/text?tag=CrashLog&x=12909b1a580000

The report indicates hung tasks within the hfs context.

Analysis
========
In the crash log, the lockdep information requires adjustment based on the call stack.
After adjustment, a deadlock is identified:

task syz.1.1902:8009
- held &disk->open_mutex
- held foio lock
- wait lock_buffer(bh)
Partial call trace:
->blkdev_writepages()
        ->writeback_iter()
                ->writeback_get_folio()
                        ->folio_lock(folio)
        ->block_write_full_folio()
                __block_write_full_folio()
                        ->lock_buffer(bh)

task syz.0.1904:8010
- held &type->s_umount_key#66 down_read
- held lock_buffer(HFS_SB(sb)->mdb_bh);
- wait folio
Partial call trace:
hfs_mdb_commit
        ->lock_buffer(HFS_SB(sb)->mdb_bh);
        ->bh = sb_bread(sb, block);
                ...->folio_lock(folio)


Other hung tasks are secondary effects of this deadlock. The issue
is reproducible in my local environment usuing the syz-reproducer.

Verification
==============

Two patches are verified against the syz-reproducer.
Neither reproduce the deadlock.

Option 1: Removing `un/lock_buffer(HFS_SB(sb)->mdb_bh)`
------------------------------------------------------

diff --git a/fs/hfs/mdb.c b/fs/hfs/mdb.c
index 53f3fae60217..c641adb94e6f 100644
--- a/fs/hfs/mdb.c
+++ b/fs/hfs/mdb.c
@@ -268,7 +268,6 @@ void hfs_mdb_commit(struct super_block *sb)
        if (sb_rdonly(sb))
                return;

-       lock_buffer(HFS_SB(sb)->mdb_bh);
        if (test_and_clear_bit(HFS_FLG_MDB_DIRTY, &HFS_SB(sb)->flags)) {
                /* These parameters may have been modified, so write them back */
                mdb->drLsMod = hfs_mtime();
@@ -340,7 +339,6 @@ void hfs_mdb_commit(struct super_block *sb)
                        size -= len;
                }
        }
-       unlock_buffer(HFS_SB(sb)->mdb_bh);
 }


Options 2: Moving `unlock_buffer(HFS_SB(sb)->mdb_bh)`
--------------------------------------------------------

diff --git a/fs/hfs/mdb.c b/fs/hfs/mdb.c
index 53f3fae60217..ec534c630c7e 100644
--- a/fs/hfs/mdb.c
+++ b/fs/hfs/mdb.c
@@ -309,6 +309,7 @@ void hfs_mdb_commit(struct super_block *sb)
                sync_dirty_buffer(HFS_SB(sb)->alt_mdb_bh);
        }
 
+       unlock_buffer(HFS_SB(sb)->mdb_bh);
        if (test_and_clear_bit(HFS_FLG_BITMAP_DIRTY, &HFS_SB(sb)->flags)) {
                struct buffer_head *bh;
                sector_t block;
@@ -340,7 +341,6 @@ void hfs_mdb_commit(struct super_block *sb)
                        size -= len;
                }
        }
-       unlock_buffer(HFS_SB(sb)->mdb_bh);
 }

Conclusion
==========

The analysis and verification confirms that the hung tasks are caused by
the deadlock between `lock_buffer(HFS_SB(sb)->mdb_bh)` and `sb_bread(sb, block)`.