[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZRytn6CxFK2oECUt@debian-BULLSEYE-live-builder-AMD64>
Date: Tue, 3 Oct 2023 20:11:11 -0400
From: Eric Whitney <enwlinux@...il.com>
To: linux-ext4@...r.kernel.org
Cc: jack@...e.cz, libaokun1@...wei.com
Subject: probable quota bug introduced in 6.6-rc1
When run on my test hardware, generic/270 triggers hung task timeouts when
run on a 6.6-rc1 (or -rc2, -rc3, -rc4) kernel with kvm-xfstests using the
nojournal test scenario. The test always passes, but about 60% of the time
the running time of the test increases by an order of magnitude or more and
one or more of the hung task timeout warnings included below can be found in
the log.
This does not reproduce on 6.5. Bisection leads to this patch:
dabc8b207566 ("quota: fix dqput() to follow the guarantees dquot_srcu should
provide")
>From the log:
generic/270 306s ... [20:08:45][ 311.322318] run fstests generic/270 at 2023-10-03 20:08:45
[ 311.579641] EXT4-fs (vdc): mounted filesystem d0e542a0-3342-4d43-aa1f-c918cc92aafa r/w without journal. Quota mode: writeback.
[ 311.587978] EXT4-fs (vdc): re-mounted d0e542a0-3342-4d43-aa1f-c918cc92aafa ro. Quota mode: writeback.
[ 311.592725] EXT4-fs (vdc): re-mounted d0e542a0-3342-4d43-aa1f-c918cc92aafa r/w. Quota mode: writeback.
[ 335.491107] 270 (3092): drop_caches: 3
[ 491.167988] INFO: task quotaon:3450 blocked for more than 122 seconds.
[ 491.168334] Not tainted 6.4.0+ #13
[ 491.168544] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 491.168936] task:quotaon state:D stack:0 pid:3450 ppid:3092 flags:0x00004000
[ 491.169363] Call Trace:
[ 491.169503] <TASK>
[ 491.169620] __schedule+0x394/0xd40
[ 491.169813] schedule+0x5d/0xd0
[ 491.169981] schedule_timeout+0x1a7/0x1c0
[ 491.170191] ? lock_release+0x139/0x280
[ 491.170395] ? lock_acquire+0xb9/0x180
[ 491.170605] ? do_raw_spin_unlock+0x4b/0xa0
[ 491.170837] __wait_for_common+0xb6/0x1e0
[ 491.171046] ? __pfx_schedule_timeout+0x10/0x10
[ 491.171324] __flush_work+0x2da/0x430
[ 491.171517] ? __pfx_wq_barrier_func+0x10/0x10
[ 491.171747] ? 0xffffffff81000000
[ 491.171932] dquot_disable+0x3e5/0x670
[ 491.172134] ext4_quota_off+0x50/0x1a0
[ 491.172332] __x64_sys_quotactl+0x87/0x1c0
[ 491.172545] do_syscall_64+0x38/0x90
[ 491.172731] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 491.172990] RIP: 0033:0x7f3c7c79eada
[ 491.173176] RSP: 002b:00007ffed2ff4478 EFLAGS: 00000246 ORIG_RAX: 00000000000000b3
[ 491.173558] RAX: ffffffffffffffda RBX: 000055886a3997d0 RCX: 00007f3c7c79eada
[ 491.173915] RDX: 0000000000000000 RSI: 000055886bf43de0 RDI: 0000000080000301
[ 491.174271] RBP: 000055886bf43de0 R08: 0000000000000001 R09: 0000000000000002
[ 491.174657] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
[ 491.175014] R13: 000055886bf43ea0 R14: 0000000000000001 R15: 0000000000000000
[ 491.175373] </TASK>
[ 491.175491]
[ 491.175491] Showing all locks held in the system:
[ 491.176706] 1 lock held by rcu_tasks_kthre/12:
[ 491.178126] #0: ffffffff82763970 (rcu_tasks.tasks_gp_mutex){....}-{3:3}, at: rcu_tasks_one_gp+0x30/0x3f0
[ 491.180955] 1 lock held by rcu_tasks_rude_/13:
[ 491.182394] #0: ffffffff827636f0 (rcu_tasks_rude.tasks_gp_mutex){....}-{3:3}, at: rcu_tasks_one_gp+0x30/0x3f0
[ 491.194388] 1 lock held by khungtaskd/26:
[ 491.196153] #0: ffffffff82764020 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0xe/0x110
[ 491.199676] 2 locks held by kworker/u4:4/59:
[ 491.200722] #0: ffff88800385cd38 ((wq_completion)events_unbound){....}-{0:0}, at: process_one_work+0x1f6/0x550
[ 491.201600] #1: ffffc90000513e80 ((quota_release_work).work){....}-{0:0}, at: process_one_work+0x1f6/0x550
[ 491.202746] 1 lock held by quotaon/3450:
[ 491.203184] #0: ffff88800afd60e0 (&type->s_umount_key#33){....}-{3:3}, at: user_get_super+0xd3/0x100
[ 491.204217]
[ 491.204373] =============================================
Eric
Powered by blists - more mailing lists