Message-ID: <CANypQFbt_+i9gRp0aRpg55AC-WNy9QAuKmCkEq4EEjcjgwkShQ@mail.gmail.com>
Date: Tue, 20 Jan 2026 21:32:04 +0800
From: Jiaming Zhang <r772577952@...il.com>
To: linux-kernel@...r.kernel.org
Cc: jlbec@...lplan.org, joseph.qi@...ux.alibaba.com, mark@...heh.com, 
	ocfs2-devel@...ts.linux.dev, syzkaller@...glegroups.com
Subject: [Linux Kernel Bug] possible deadlock in ocfs2_try_to_free_truncate_log

Dear Linux kernel developers and maintainers,

We are writing to report a possible deadlock discovered in the ocfs2
subsystem using our generated syzkaller specifications. The issue is
reproducible on the latest Linux kernel (v6.19-rc6, commit
24d479d26b25bce5faea3ddd9fa8f3a6c3129ea7). The kernel console output,
kernel config, syzkaller reproducer, and C reproducer are attached to
help with analysis.

Please note that syzbot has reported similar issues before, but it
could not generate a reproducer:
- https://syzkaller.appspot.com/bug?id=5f4444e2c7d7202350517772f84e7ed70d97c9a3
- https://syzkaller.appspot.com/bug?extid=c535cfdd86331295512d

The kernel report (formatted by syz-symbolize), root cause analysis,
and a proposed patch are listed below:

---

loop0: detected capacity change from 0 to 32768
JBD2: Ignoring recovery information on journal
ocfs2: Mounting device (7,0) on (node local, slot 0) with ordered data mode.

============================================
WARNING: possible recursive locking detected
6.19.0-rc6 #34 Not tainted
--------------------------------------------
repro.out/9774 is trying to acquire lock:
ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
inode_lock include/linux/fs.h:1027 [inline]
ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
ocfs2_try_to_free_truncate_log+0xaf/0x360 fs/ocfs2/alloc.c:6132

but task is already holding lock:
ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
inode_lock include/linux/fs.h:1027 [inline]
ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
ocfs2_defrag_extent fs/ocfs2/move_extents.c:253 [inline]
ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
__ocfs2_move_extents_range fs/ocfs2/move_extents.c:862 [inline]
ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
ocfs2_move_extents+0x160b/0x3dd0 fs/ocfs2/move_extents.c:937

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]);
  lock(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

5 locks held by repro.out/9774:
 #0: ffff8880461e6420 (sb_writers#11){.+.+}-{0:0}, at:
mnt_want_write_file+0x60/0x200 fs/namespace.c:543
 #1: ffff8880531e42c0 (&sb->s_type->i_mutex_key#20){+.+.}-{4:4}, at:
inode_lock include/linux/fs.h:1027 [inline]
 #1: ffff8880531e42c0 (&sb->s_type->i_mutex_key#20){+.+.}-{4:4}, at:
ocfs2_move_extents+0x1fb/0x3dd0 fs/ocfs2/move_extents.c:915
 #2: ffff8880531e3f60 (&oi->ip_alloc_sem){+.+.}-{4:4}, at:
ocfs2_move_extents+0x3b3/0x3dd0 fs/ocfs2/move_extents.c:935
 #3: ffff8880531b6d80
(&ocfs2_sysfile_lock_key[EXTENT_ALLOC_SYSTEM_INODE]){+.+.}-{4:4}, at:
inode_lock include/linux/fs.h:1027 [inline]
 #3: ffff8880531b6d80
(&ocfs2_sysfile_lock_key[EXTENT_ALLOC_SYSTEM_INODE]){+.+.}-{4:4}, at:
ocfs2_reserve_suballoc_bits+0x15e/0x45c0 fs/ocfs2/suballoc.c:789
 #4: ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
inode_lock include/linux/fs.h:1027 [inline]
 #4: ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
ocfs2_defrag_extent fs/ocfs2/move_extents.c:253 [inline]
 #4: ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
__ocfs2_move_extents_range fs/ocfs2/move_extents.c:862 [inline]
 #4: ffff8880531e3480
(&ocfs2_sysfile_lock_key[TRUNCATE_LOG_SYSTEM_INODE]){+.+.}-{4:4}, at:
ocfs2_move_extents+0x160b/0x3dd0 fs/ocfs2/move_extents.c:937

stack backtrace:
CPU: 1 UID: 0 PID: 9774 Comm: repro.out Not tainted 6.19.0-rc6 #34 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x10e/0x190 lib/dump_stack.c:120
 print_deadlock_bug+0x2ae/0x2c0 kernel/locking/lockdep.c:3041
 check_deadlock kernel/locking/lockdep.c:3093 [inline]
 validate_chain+0x908/0x23e0 kernel/locking/lockdep.c:3895
 __lock_acquire+0xad0/0xd50 kernel/locking/lockdep.c:5237
 lock_acquire+0x107/0x340 kernel/locking/lockdep.c:5868
 down_write+0x96/0x1f0 kernel/locking/rwsem.c:1590
 inode_lock include/linux/fs.h:1027 [inline]
 ocfs2_try_to_free_truncate_log+0xaf/0x360 fs/ocfs2/alloc.c:6132
 ocfs2_reserve_clusters_with_limit+0x3c2/0xba0 fs/ocfs2/suballoc.c:1187
 ocfs2_defrag_extent fs/ocfs2/move_extents.c:272 [inline]
 __ocfs2_move_extents_range fs/ocfs2/move_extents.c:862 [inline]
 ocfs2_move_extents+0x196d/0x3dd0 fs/ocfs2/move_extents.c:937
 ocfs2_ioctl_move_extents+0x56e/0x740 fs/ocfs2/move_extents.c:1069
 ocfs2_ioctl+0x191/0x750 fs/ocfs2/ioctl.c:942
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xe8/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x450239
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 14 00 00 90 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe2c31add8 EFLAGS: 00000213 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000004004a0 RCX: 0000000000450239
RDX: 0000200000000240 RSI: 0000000040406f06 RDI: 0000000000000006
RBP: 00007ffe2c31adf0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000213 R12: 0000000000409aa0
R13: 0000000000000000 R14: 00000000004bf018 R15: 00000000004004a0
 </TASK>

---

The root cause is recursive locking. ocfs2_defrag_extent() calls
ocfs2_reserve_clusters() while holding the lock on osb->osb_tl_inode.
In the following call chain:

ocfs2_defrag_extent() ->
ocfs2_reserve_clusters() ->
ocfs2_reserve_clusters_with_limit() ->
ocfs2_try_to_free_truncate_log()

ocfs2_try_to_free_truncate_log() attempts to acquire the
osb->osb_tl_inode lock again
(https://github.com/torvalds/linux/blob/v6.19-rc6/fs/ocfs2/alloc.c#L6132),
which deadlocks because inode_lock() is not reentrant.

We can fix this issue by shrinking the critical section of
osb->osb_tl_inode so that it primarily protects
__ocfs2_flush_truncate_log(). Once the flush is complete, we unlock
osb->osb_tl_inode before calling ocfs2_reserve_clusters(), so that
ocfs2_try_to_free_truncate_log() can acquire the lock safely, without
the possible deadlock:
```
--- a/fs/ocfs2/move_extents.c
+++ b/fs/ocfs2/move_extents.c
@@ -256,10 +256,13 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
    ret = __ocfs2_flush_truncate_log(osb);
    if (ret < 0) {
      mlog_errno(ret);
-     goto out_unlock_mutex;
+     inode_unlock(tl_inode);
+     goto out_no_unlock;
    }
  }

+ inode_unlock(tl_inode);
+
  /*
   * Make sure ocfs2_reserve_cluster is called after
   * __ocfs2_flush_truncate_log, otherwise, dead lock may happen.
@@ -272,14 +275,14 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,
  ret = ocfs2_reserve_clusters(osb, *len, &context->data_ac);
  if (ret) {
    mlog_errno(ret);
-   goto out_unlock_mutex;
+   goto out_no_unlock;
  }

  handle = ocfs2_start_trans(osb, credits);
  if (IS_ERR(handle)) {
    ret = PTR_ERR(handle);
    mlog_errno(ret);
-   goto out_unlock_mutex;
+   goto out_no_unlock;
  }

  ret = __ocfs2_claim_clusters(handle, context->data_ac, 1, *len,
@@ -341,9 +344,7 @@ static int ocfs2_defrag_extent(struct ocfs2_move_extents_context *context,

  ocfs2_commit_trans(osb, handle);

-out_unlock_mutex:
- inode_unlock(tl_inode);
-
+out_no_unlock:
  if (context->data_ac) {
    ocfs2_free_alloc_context(context->data_ac);
    context->data_ac = NULL;
```

After applying the above changes, we ran the reproducer for ~15
minutes without triggering any issues.

If this solution is acceptable, we are happy to submit a patch :)

Please let me know if any further information is required.

Best Regards,
Jiaming Zhang

View attachment "repro.c" of type "text/plain" (100625 bytes)

Download attachment "kernel.log" of type "application/octet-stream" (187461 bytes)

Download attachment ".config" of type "application/xml" (273663 bytes)

Download attachment "report" of type "application/octet-stream" (4955 bytes)

Download attachment "repro.syz" of type "application/octet-stream" (24268 bytes)
