Message-ID: <0349430786e4553845c30490e19b08451c8b999f.camel@ibm.com>
Date: Mon, 19 Jan 2026 18:09:16 +0000
From: Viacheslav Dubeyko <Slava.Dubeyko@....com>
To: "wangjinchao600@...il.com" <wangjinchao600@...il.com>
CC: "glaubitz@...sik.fu-berlin.de" <glaubitz@...sik.fu-berlin.de>,
"frank.li@...o.com" <frank.li@...o.com>,
"slava@...eyko.com"
<slava@...eyko.com>,
"linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org"
<linux-fsdevel@...r.kernel.org>,
"syzbot+1e3ff4b07c16ca0f6fe2@...kaller.appspotmail.com"
<syzbot+1e3ff4b07c16ca0f6fe2@...kaller.appspotmail.com>
Subject: RE: [RFC PATCH] fs/hfs: fix ABBA deadlock in hfs_mdb_commit
On Fri, 2026-01-16 at 16:10 +0800, Jinchao Wang wrote:
> On Thu, Jan 15, 2026 at 09:12:49PM +0000, Viacheslav Dubeyko wrote:
> > On Thu, 2026-01-15 at 11:34 +0800, Jinchao Wang wrote:
> > > On Wed, Jan 14, 2026 at 07:29:45PM +0000, Viacheslav Dubeyko wrote:
> > > > On Wed, 2026-01-14 at 11:03 +0800, Jinchao Wang wrote:
> > > > > On Tue, Jan 13, 2026 at 08:52:45PM +0000, Viacheslav Dubeyko wrote:
> > > > > > On Tue, 2026-01-13 at 16:19 +0800, Jinchao Wang wrote:
> > > > > > > syzbot reported a hung task in hfs_mdb_commit where a deadlock occurs
> > > > > > > between the MDB buffer lock and the folio lock.
> > > > > > >
> > > > > > > The deadlock happens because hfs_mdb_commit() holds the mdb_bh
> > > > > > > lock while calling sb_bread(), which attempts to acquire the lock
> > > > > > > on the same folio.
> > > > > >
> > > > > > I don't quite follow your logic. We have only one sb_bread() [1] in
> > > > > > hfs_mdb_commit(). This read is trying to extract the volume bitmap. How is it
> > > > > > possible that the superblock and the volume bitmap are located in the same
> > > > > > folio? Are you sure? Which folio size do you imply here?
> > > > > >
> > > > > > Also, if your logic were correct, then we would never be able to mount/unmount
> > > > > > or run any operations on HFS volumes because of a similar deadlock. However, I
> > > > > > can run xfstests on an HFS volume.
> > > > > >
> > > > > > [1] https://elixir.bootlin.com/linux/v6.19-rc5/source/fs/hfs/mdb.c#L324
> > > > >
> > > > > Hi Viacheslav,
> > > > >
> > > > > After reviewing your feedback, I realized that my previous RFC was not in
> > > > > the correct format. It was not intended to be a final, merge-ready patch,
> > > > > but rather a record of the analysis and trial fixes conducted so far.
> > > > > I apologize for the confusion caused by my previous email.
> > > > >
> > > > > The details are reorganized as follows:
> > > > >
> > > > > - Observation
> > > > > - Analysis
> > > > > - Verification
> > > > > - Conclusion
> > > > >
> > > > > Observation
> > > > > ============
> > > > >
> > > > > Syzbot report: https://syzkaller.appspot.com/bug?extid=1e3ff4b07c16ca0f6fe2
> > > > >
> > > > > For this version:
> > > > > > time | kernel | Commit | Syzkaller |
> > > > > > 2025/12/20 17:03 | linux-next | cc3aa43b44bd | d6526ea3 |
> > > > >
> > > > > Crash log: https://syzkaller.appspot.com/text?tag=CrashLog&x=12909b1a580000
> > > > >
> > > > > The report indicates hung tasks within the hfs context.
> > > > >
> > > > > Analysis
> > > > > ========
> > > > > In the crash log, the lockdep information requires adjustment based on the call stack.
> > > > > After adjustment, a deadlock is identified:
> > > > >
> > > > > task syz.1.1902:8009
> > > > > - held &disk->open_mutex
> > > > > - held folio lock
> > > > > - wait lock_buffer(bh)
> > > > > Partial call trace:
> > > > > ->blkdev_writepages()
> > > > > ->writeback_iter()
> > > > > ->writeback_get_folio()
> > > > > ->folio_lock(folio)
> > > > > ->block_write_full_folio()
> > > > > ->__block_write_full_folio()
> > > > > ->lock_buffer(bh)
> > > > >
> > > > > task syz.0.1904:8010
> > > > > - held &type->s_umount_key#66 down_read
> > > > > - held lock_buffer(HFS_SB(sb)->mdb_bh);
> > > > > - wait folio
> > > > > Partial call trace:
> > > > > hfs_mdb_commit
> > > > > ->lock_buffer(HFS_SB(sb)->mdb_bh);
> > > > > ->bh = sb_bread(sb, block);
> > > > > ...->folio_lock(folio)
> > > > >
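> > > > > Lock-acquisition order summarized from the two traces above (task
> > > > > names as reported in the crash log):
> > > > >
> > > > >   syz.1.1902: folio_lock(folio)   -> waits on lock_buffer(mdb_bh)
> > > > >   syz.0.1904: lock_buffer(mdb_bh) -> waits on folio_lock(folio)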
> > > > >
> > > > > Other hung tasks are secondary effects of this deadlock. The issue
> > > > > is reproducible in my local environment using the syz-reproducer.
> > > > >
> > > > > Verification
> > > > > ==============
> > > > >
> > > > > Two patches have been verified against the syz-reproducer.
> > > > > With either patch applied, the deadlock no longer reproduces.
> > > > >
> > > > > Option 1: Removing `un/lock_buffer(HFS_SB(sb)->mdb_bh)`
> > > > > ------------------------------------------------------
> > > > >
> > > > > diff --git a/fs/hfs/mdb.c b/fs/hfs/mdb.c
> > > > > index 53f3fae60217..c641adb94e6f 100644
> > > > > --- a/fs/hfs/mdb.c
> > > > > +++ b/fs/hfs/mdb.c
> > > > > @@ -268,7 +268,6 @@ void hfs_mdb_commit(struct super_block *sb)
> > > > > if (sb_rdonly(sb))
> > > > > return;
> > > > >
> > > > > - lock_buffer(HFS_SB(sb)->mdb_bh);
> > > > > if (test_and_clear_bit(HFS_FLG_MDB_DIRTY, &HFS_SB(sb)->flags)) {
> > > > > /* These parameters may have been modified, so write them back */
> > > > > mdb->drLsMod = hfs_mtime();
> > > > > @@ -340,7 +339,6 @@ void hfs_mdb_commit(struct super_block *sb)
> > > > > size -= len;
> > > > > }
> > > > > }
> > > > > - unlock_buffer(HFS_SB(sb)->mdb_bh);
> > > > > }
> > > > >
> > > > >
> > > > > Option 2: Moving `unlock_buffer(HFS_SB(sb)->mdb_bh)` before the bitmap update
> > > > > --------------------------------------------------------
> > > > >
> > > > > diff --git a/fs/hfs/mdb.c b/fs/hfs/mdb.c
> > > > > index 53f3fae60217..ec534c630c7e 100644
> > > > > --- a/fs/hfs/mdb.c
> > > > > +++ b/fs/hfs/mdb.c
> > > > > @@ -309,6 +309,7 @@ void hfs_mdb_commit(struct super_block *sb)
> > > > > sync_dirty_buffer(HFS_SB(sb)->alt_mdb_bh);
> > > > > }
> > > > >
> > > > > + unlock_buffer(HFS_SB(sb)->mdb_bh);
> > > > > if (test_and_clear_bit(HFS_FLG_BITMAP_DIRTY, &HFS_SB(sb)->flags)) {
> > > > > struct buffer_head *bh;
> > > > > sector_t block;
> > > > > @@ -340,7 +341,6 @@ void hfs_mdb_commit(struct super_block *sb)
> > > > > size -= len;
> > > > > }
> > > > > }
> > > > > - unlock_buffer(HFS_SB(sb)->mdb_bh);
> > > > > }
> > > > >
> > > > > Conclusion
> > > > > ==========
> > > > >
> > > > > The analysis and verification confirm that the hung tasks are caused by
> > > > > the deadlock between `lock_buffer(HFS_SB(sb)->mdb_bh)` and `sb_bread(sb, block)`.
> > > >
> > > > First of all, we need to answer this question: How is it
> > > > possible that the superblock and the volume bitmap are located in the same folio
> > > > or logical block? In the normal case, the superblock and the volume bitmap should
> > > > not be located in the same logical block. It sounds to me that you have a
> > > > corrupted volume, and this is why this logic [1] finally overlaps with the
> > > > superblock location:
> > > >
> > > > block = be16_to_cpu(HFS_SB(sb)->mdb->drVBMSt) + HFS_SB(sb)->part_start;
> > > > off = (block << HFS_SECTOR_SIZE_BITS) & (sb->s_blocksize - 1);
> > > > block >>= sb->s_blocksize_bits - HFS_SECTOR_SIZE_BITS;
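> > > >
> > > > For reference, plugging in the values that show up later in this thread
> > > > (drVBMSt = 3, part_start = 0, s_blocksize = 512, i.e. s_blocksize_bits = 9
> > > > and HFS_SECTOR_SIZE_BITS = 9), this logic yields:
> > > >
> > > > 	block = 3 + 0 = 3
> > > > 	off   = (3 << 9) & (512 - 1) = 1536 & 511 = 0
> > > > 	block = 3 >> (9 - 9) = 3
> > > >
> > > > i.e. "off 0, block 3": a different logical block than the MDB (block 2),
> > > > but not necessarily a different folio.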
> > > >
> > > > I assume that the superblock is corrupted and that mdb->drVBMSt [2] contains
> > > > incorrect metadata. As a result, we end up in this deadlock situation. The fix
> > > > should not be here; instead, we need to add some sanity check of mdb->drVBMSt
> > > > somewhere in the hfs_fill_super() workflow.
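> > > >
> > > > A rough, untested sketch of the kind of sanity check meant here (the exact
> > > > place in the hfs_mdb_get()/hfs_fill_super() path, the error handling and the
> > > > message text are illustrative only, not a proposed patch):
> > > >
> > > > 	/* Hypothetical check: reject a volume whose bitmap start, after the
> > > > 	 * same conversion as in hfs_mdb_commit(), lands on the logical block
> > > > 	 * that already holds the MDB. */
> > > > 	sector_t vbm_block;
> > > >
> > > > 	vbm_block = be16_to_cpu(mdb->drVBMSt) + HFS_SB(sb)->part_start;
> > > > 	vbm_block >>= sb->s_blocksize_bits - HFS_SECTOR_SIZE_BITS;
> > > > 	if (vbm_block == HFS_SB(sb)->mdb_bh->b_blocknr) {
> > > > 		pr_err("drVBMSt overlaps the MDB block, refusing to mount\n");
> > > > 		goto out;	/* hypothetical error label */
> > > > 	}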
> > > >
> > > > Could you please check my reasoning?
> > > >
> > > > Thanks,
> > > > Slava.
> > > >
> > > > [1] https://elixir.bootlin.com/linux/v6.19-rc5/source/fs/hfs/mdb.c#L318
> > > > [2]
> > > > https://elixir.bootlin.com/linux/v6.19-rc5/source/include/linux/hfs_common.h#L196
> > >
> > > Hi Slava,
> > >
> > > I have traced the values during the hang. Here are the values observed:
> > >
> > > - MDB: blocknr=2
> > > - Volume Bitmap (drVBMSt): 3
> > > - s_blocksize: 512 bytes
> > >
> > > This confirms a circular dependency between the folio lock and
> > > the buffer lock. The writeback thread holds the 4KB folio lock and
> > > waits for the MDB buffer lock (block 2). Simultaneously, the HFS sync
> > > thread holds the MDB buffer lock and waits for the same folio lock
> > > to read the bitmap (block 3).
> > >
> > >
> > > Since block 2 and block 3 share the same folio, this locking
> > > inversion occurs. I would appreciate your thoughts on whether
> > > hfs_fill_super() should validate drVBMSt to ensure the bitmap
> > > does not reside in the same folio as the MDB.
> >
> >
> > As far as I can see, I can run xfstests on an HFS volume (for example,
> > generic/001 has finished successfully):
> >
> > sudo ./check -g auto -E ./my_exclude.txt
> > FSTYP -- hfs
> > PLATFORM -- Linux/x86_64 hfsplus-testing-0001 6.19.0-rc1+ #56 SMP
> > PREEMPT_DYNAMIC Thu Jan 15 12:55:22 PST 2026
> > MKFS_OPTIONS -- /dev/loop51
> > MOUNT_OPTIONS -- /dev/loop51 /mnt/scratch
> >
> > generic/001 36s ... 36s
> >
> > 2026-01-15T13:00:07.589868-08:00 hfsplus-testing-0001 kernel: run fstests
> > generic/001 at 2026-01-15 13:00:07
> > 2026-01-15T13:00:07.661605-08:00 hfsplus-testing-0001 systemd[1]: Started
> > fstests-generic-001.scope - /usr/bin/bash -c "test -w /proc/self/oom_score_adj
> > && echo 250 > /proc/self/oom_score_adj; exec ./tests/generic/001".
> > 2026-01-15T13:00:13.355795-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():296 HFS_SB(sb)->mdb_bh buffer has been locked
> > 2026-01-15T13:00:13.355809-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():348 drVBMSt 3, part_start 0, off 0, block 3, size 8167
> > 2026-01-15T13:00:13.355810-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355810-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355811-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355812-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355812-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355812-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355813-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355813-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355813-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355814-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355814-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355815-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355815-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355815-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355816-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355816-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355816-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355816-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355817-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355818-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355818-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355818-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355819-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355819-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355819-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355819-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355820-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355820-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355821-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355821-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355821-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.355822-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.355822-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():383 HFS_SB(sb)->mdb_bh buffer has been unlocked
> > 2026-01-15T13:00:13.681527-08:00 hfsplus-testing-0001 systemd[1]: fstests-
> > generic-001.scope: Deactivated successfully.
> > 2026-01-15T13:00:13.681597-08:00 hfsplus-testing-0001 systemd[1]: fstests-
> > generic-001.scope: Consumed 5.928s CPU time.
> > 2026-01-15T13:00:13.714928-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():296 HFS_SB(sb)->mdb_bh buffer has been locked
> > 2026-01-15T13:00:13.714942-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():348 drVBMSt 3, part_start 0, off 0, block 3, size 8167
> > 2026-01-15T13:00:13.714943-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714944-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714944-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714944-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714945-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714945-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714946-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714946-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714947-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714947-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714947-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714948-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714948-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714948-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714949-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714949-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714950-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714950-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714950-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714951-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714951-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714952-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714952-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714952-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714953-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714953-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714953-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714954-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714954-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714955-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714955-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():356 start read volume bitmap block
> > 2026-01-15T13:00:13.714955-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():370 volume bitmap block has been read and copied
> > 2026-01-15T13:00:13.714956-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():383 HFS_SB(sb)->mdb_bh buffer has been unlocked
> > 2026-01-15T13:00:13.716742-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():296 HFS_SB(sb)->mdb_bh buffer has been locked
> > 2026-01-15T13:00:13.716754-08:00 hfsplus-testing-0001 kernel: hfs:
> > hfs_mdb_commit():383 HFS_SB(sb)->mdb_bh buffer has been unlocked
> > 2026-01-15T13:00:13.722184-08:00 hfsplus-testing-0001 systemd[1]: mnt-
> > test.mount: Deactivated successfully.
> >
> > And I don't see any locking issues in the added debug output. I don't see the
> > reported deadlock being reproduced. And the logic of hfs_mdb_commit() looks
> > correct enough.
> >
> > The main question is: how can blkdev_writepages() collide with hfs_mdb_commit()?
> > I assume that blkdev_writepages() is trying to flush the user data. So, what is
> > the problem here? Is it an allocation issue? Does it mean that some file was not
> > properly allocated? Or does it mean that the superblock commit somehow collided
> > with the user data flush? But how is that possible? Which particular workload
> > could have such an issue?
> >
> > Currently, your analysis doesn't show what the problem is or how it happened.
> >
> > Thanks,
> > Slava.
>
> Hi Slava,
>
> Thank you very much for your feedback and for taking the time to
> review this. I apologize if my previous analysis was not clear
> enough. As I am relatively new to this area, I truly appreciate
> your patience.
>
> After further tracing, I would like to share more details on how the
> collision between blkdev_writepages() and hfs_mdb_commit() occurs.
> It appears to be a timing-specific race condition.
>
> 1. Physical Overlap (The "How"):
> In my environment, the HFS block size is 512B and the MDB is located
> at block 2 (offset 1024). Since 1024 < 4096, the MDB resides
> within the block device's first folio (index 0).
> Consequently, both the filesystem layer (via mdb_bh) and the block
> layer (via bdev mapping) operate on the exact same folio at index 0.
>
> 2. The Race Window (The "Why"):
> The collision is triggered by the global nature of ksys_sync(). In
> a system with multiple mounted devices, there is a significant time
> gap between Stage 1 (iterate_supers) and Stage 2 (sync_bdevs). This
> window allows a concurrent task to dirty the MDB folio after one
> sync task has already passed its FS-sync stage.
>
> 3. Proposed Reproduction Timeline:
> - Task A: Starts ksys_sync() and finishes iterate_supers()
> for the HFS device. It then moves on to sync other devices.
> - Task B: Creates a new file on HFS, then starts its
> own ksys_sync().
> - Task B: Enters hfs_mdb_commit(), calls lock_buffer(mdb_bh) and
> mark_buffer_dirty(mdb_bh). This makes folio 0 dirty.
> - Task A: Finally reaches sync_bdevs() for the HFS device. It sees
> folio 0 is dirty, calls folio_lock(folio), and then attempts
> to lock_buffer(mdb_bh) for I/O.
> - Task A: Blocks waiting for mdb_bh lock (held by Task B).
> - Task B: Continues hfs_mdb_commit() -> sb_bread(), which attempts
> to lock folio 0 (held by Task A).
>
> This results in an ABBA deadlock between the folio lock and the
> buffer lock.
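>
> To make the inversion concrete, here is the arithmetic implied by the numbers
> above (assuming PAGE_SIZE = 4096 and s_blocksize = 512; this just restates the
> analysis, it is not new data):
>
>   MDB:    block 2 -> byte offset 2 * 512 = 1024 -> bdev folio index 0
>   Bitmap: block 3 -> byte offset 3 * 512 = 1536 -> bdev folio index 0
>
>   Task A (sync_bdevs):     folio_lock(folio 0)  -> lock_buffer(mdb_bh)  [blocks]
>   Task B (hfs_mdb_commit): lock_buffer(mdb_bh)  -> folio_lock(folio 0)  [blocks]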
>
> I hope this clarifies why the collision is possible even though
> hfs_mdb_commit() seems correct in isolation. It is the concurrent
> interleaving of FS-level and BDEV-level syncs that triggers the
> violation of the Folio -> Buffer locking order.
>
> I would be very grateful for your thoughts on this updated analysis.
>
>
First of all, I've tried to check the syzbot report that you are mentioning in
the patch. And I was confused because it was a report for FAT. So, I don't see
how I can reproduce the issue on my side.

Secondly, I need to see the real call trace of the issue. This discussion
doesn't make sense without the reproduction path and the call trace(s) of the
issue.

Thanks,
Slava.