Message-ID: <aXA9j9oQLHAHPP46@ndev>
Date: Wed, 21 Jan 2026 10:44:31 +0800
From: Jinchao Wang <wangjinchao600@...il.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@....com>
Cc: "glaubitz@...sik.fu-berlin.de" <glaubitz@...sik.fu-berlin.de>,
"frank.li@...o.com" <frank.li@...o.com>,
"slava@...eyko.com" <slava@...eyko.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"syzbot+1e3ff4b07c16ca0f6fe2@...kaller.appspotmail.com" <syzbot+1e3ff4b07c16ca0f6fe2@...kaller.appspotmail.com>
Subject: Re: [RFC PATCH] fs/hfs: fix ABBA deadlock in hfs_mdb_commit
On Tue, Jan 20, 2026 at 08:51:06PM +0000, Viacheslav Dubeyko wrote:
> On Tue, 2026-01-20 at 09:09 +0800, Jinchao Wang wrote:
> >
>
> <skipped>
>
> > >
> > > First of all, I've tried to check the syzbot report that you are mentioning in
> > > the patch. And I was confused because it was report for FAT. So, I don't see the
> > > way how I can reproduce the issue on my side.
> > >
> > > Secondly, I need to see the real call trace of the issue. This discussion
> > > doesn't make sense without the reproduction path and the call trace(s) of the
> > > issue.
> > >
> > > Thanks,
> > > Slava.
> > There are many crashes on the syzbot report page; please follow the specified time and version.
> >
> > Syzbot report: https://syzkaller.appspot.com/bug?extid=1e3ff4b07c16ca0f6fe2
> >
> > For this version:
> > > time | kernel | Commit | Syzkaller |
> > > 2025/12/20 17:03 | linux-next | cc3aa43b44bd | d6526ea3 |
> >
> > The full call trace can be found in the crash log for "2025/12/20 17:03", at the following URL:
> >
> > Crash log: https://syzkaller.appspot.com/text?tag=CrashLog&x=12909b1a580000
>
> This call trace is dedicated to flushing inode's dirty pages in page cache, as
> far as I can see:
>
> [ 504.401993][ T31] INFO: task kworker/u8:1:13 blocked for more than 143
> seconds.
> [ 504.434587][ T31] Not tainted syzkaller #0
> [ 504.441437][ T31] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 504.451145][ T31] task:kworker/u8:1 state:D stack:22792 pid:13
> tgid:13 ppid:2 task_flags:0x4208060 flags:0x00080000
> [ 504.463591][ T31] Workqueue: writeback wb_workfn (flush-7:4)
> [ 504.471997][ T31] Call Trace:
> [ 504.475502][ T31] <TASK>
> ...
> [ 504.805695][ T31] </TASK>
>
> And this call trace is dedicated to superblock commit:
>
> [ 505.186758][ T31] INFO: task kworker/1:4:5971 blocked for more than 144
> seconds.
> [ 505.194752][ T8014] Bluetooth: hci37: command tx timeout
> [ 505.210267][ T31] Not tainted syzkaller #0
> [ 505.215260][ T31] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 505.273687][ T31] task:kworker/1:4 state:D stack:24152 pid:5971
> tgid:5971 ppid:2 task_flags:0x4208060 flags:0x00080000
> [ 505.287569][ T31] Workqueue: events_long flush_mdb
> [ 505.293762][ T31] Call Trace:
> [ 505.297607][ T31] <TASK>
> ...
> [ 505.570372][ T31] </TASK>
>
> I don't see any relation between folios in inode's page cache and HFS_SB(sb)-
> >mdb_bh because they cannot share the same folio.
What you pasted are not the relevant tasks. Please see the analysis below, which I sent before,
and focus on task IDs 8009 and 8010.
Analysis
========
In the crash log, the lockdep information needs to be adjusted based on the call stacks.
After adjustment, the following deadlock is identified:
** Task syz.1.1902:8009 **
- holds &disk->open_mutex
- holds the folio lock
- waits on lock_buffer(bh)
Partial call trace:
blkdev_writepages()
->writeback_iter()
->writeback_get_folio()
->folio_lock(folio)
->block_write_full_folio()
->__block_write_full_folio()
->lock_buffer(bh)
** Task syz.0.1904:8010 **
- holds &type->s_umount_key#66 (down_read)
- holds lock_buffer(HFS_SB(sb)->mdb_bh)
- waits on the folio lock
Partial call trace:
hfs_mdb_commit()
->lock_buffer(HFS_SB(sb)->mdb_bh)
->bh = sb_bread(sb, block)
...->folio_lock(folio)
Other hung tasks are secondary effects of this deadlock. The issue
is reproducible in my local environment using the syzkaller reproducer.
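
To make the lock ordering explicit, here is a minimal userspace sketch of
the ABBA pattern (illustration only, not kernel code): two pthread mutexes
stand in for the folio lock and the buffer lock, and the names
writeback_path/mdb_commit_path are just labels mirroring the two tasks
above. One thread takes "folio" then "buffer", the other takes "buffer"
then "folio", and they deadlock against each other:

/* Build with: gcc -pthread abba_demo.c -o abba_demo */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t folio_lock  = PTHREAD_MUTEX_INITIALIZER; /* "A" */
static pthread_mutex_t buffer_lock = PTHREAD_MUTEX_INITIALIZER; /* "B" */

static void *writeback_path(void *arg)    /* mirrors task 8009 */
{
	pthread_mutex_lock(&folio_lock);  /* like folio_lock(folio) */
	sleep(1);                         /* widen the race window  */
	pthread_mutex_lock(&buffer_lock); /* like lock_buffer(bh)   */
	pthread_mutex_unlock(&buffer_lock);
	pthread_mutex_unlock(&folio_lock);
	return NULL;
}

static void *mdb_commit_path(void *arg)   /* mirrors task 8010 */
{
	pthread_mutex_lock(&buffer_lock); /* like lock_buffer(mdb_bh)      */
	sleep(1);                         /* widen the race window         */
	pthread_mutex_lock(&folio_lock);  /* like sb_bread() locking folio */
	pthread_mutex_unlock(&folio_lock);
	pthread_mutex_unlock(&buffer_lock);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, writeback_path, NULL);
	pthread_create(&t2, NULL, mdb_commit_path, NULL);
	pthread_join(t1, NULL); /* never returns: ABBA deadlock */
	pthread_join(t2, NULL);
	printf("not reached\n");
	return 0;
}
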
> I still don't see from your
> explanation how the issue could happen. I don't see how lock_buffer(HFS_SB(sb)-
> >mdb_bh) can be responsible for the issue.
> On the contrary, if we follow your
> logic, then we would never be able to mount any HFS volume. But xfstests works for
> HFS file systems (of course, multiple tests fail) and I cannot see the deadlock
> in the common case. So, you need to explain which particular use-case can
> reproduce the issue and what the mechanism of the deadlock is.
>
Please follow the steps I sent and reproduce the issue.
Have you tried the specific time and version listed on the syzbot report page?
| time | kernel | Commit | Syzkaller |
| 2025/12/20 17:03 | linux-next | cc3aa43b44bd | d6526ea3 |
--
Thanks,
Jinchao