Message-ID: <20230226223403.GU360264@dread.disaster.area>
Date: Mon, 27 Feb 2023 09:34:03 +1100
From: Dave Chinner <david@...morbit.com>
To: Helge Deller <deller@....de>
Cc: Pengfei Xu <pengfei.xu@...el.com>,
linux-xfs <linux-xfs@...r.kernel.org>, asml.silence@...il.com,
geert@...ux-m68k.org, linux-kernel@...r.kernel.org,
heng.su@...el.com
Subject: Re: [Syzkaller & bisect] There is "xfs_dquot_alloc" related BUG in
v6.2 in guest
On Sat, Feb 25, 2023 at 08:58:25PM +0100, Helge Deller wrote:
> Looping in xfs mailing list as this seems to be a XFS problem...
> On 2/24/23 05:39, Pengfei Xu wrote:
> > [ 71.225966] XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
> > [ 71.226310] xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
> > [ 71.227591] BUG: kernel NULL pointer dereference, address: 00000000000002a8
> > [ 71.227873] #PF: supervisor read access in kernel mode
> > [ 71.228077] #PF: error_code(0x0000) - not-present page
> > [ 71.228280] PGD c313067 P4D c313067 PUD c1fe067 PMD 0
> > [ 71.228494] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [ 71.228673] CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
> > [ 71.228961] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [ 71.229400] Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
> > [ 71.229626] RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
> > [ 71.229820] Code: 80 15 ad 85 48 c7 c6 7c 6b 92 83 e8 75 0f 6b ff 49 8b 8d 60 01 00 00 44 89 e0 31 d2 48 c7 c6 18 ae 8f 83 48 8d bb 18 02 00 00 <f7> b1 a8 02 2
> > [ 71.230528] RSP: 0018:ffffc90000babc20 EFLAGS: 00010246
> > [ 71.230737] RAX: 0000000000000009 RBX: ffff8880093c98c0 RCX: 0000000000000000
> > [ 71.231014] RDX: 0000000000000000 RSI: ffffffff838fae18 RDI: ffff8880093c9ad8
> > [ 71.231292] RBP: ffffc90000babc48 R08: 0000000000000002 R09: 0000000000000000
> > [ 71.231570] R10: ffffc90000baba80 R11: ffff88800af08d98 R12: 0000000000000009
> > [ 71.231850] R13: ffff88800c4bc000 R14: ffff88800c4bc000 R15: 0000000000000004
> > [ 71.232129] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
> > [ 71.232441] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 71.232668] CR2: 00000000000002a8 CR3: 000000000a1d2002 CR4: 0000000000770ef0
> > [ 71.232949] PKRU: 55555554
> > [ 71.233061] Call Trace:
> > [ 71.233162] <TASK>
> > [ 71.233254] xfs_qm_dqread+0x46/0x440
> > [ 71.233410] ? xfs_qm_dqget_inode+0x13e/0x500
> > [ 71.233596] xfs_qm_dqget_inode+0x154/0x500
> > [ 71.233774] xfs_qm_dqattach_one+0x142/0x3c0
> > [ 71.233961] xfs_qm_dqattach_locked+0x14a/0x170
> > [ 71.234149] xfs_qm_dqattach+0x52/0x80
> > [ 71.234307] xfs_inactive+0x186/0x340
> > [ 71.234461] xfs_inodegc_worker+0xd3/0x430
> > [ 71.234630] process_one_work+0x3b1/0x960
> > [ 71.234802] worker_thread+0x52/0x660
> > [ 71.234957] ? __pfx_worker_thread+0x10/0x10
> > [ 71.235136] kthread+0x161/0x1a0
> > [ 71.235279] ? __pfx_kthread+0x10/0x10
> > [ 71.235442] ret_from_fork+0x29/0x50
> > [ 71.235602] </TASK>
> > [ 71.235696] Modules linked in:
> > [ 71.235826] CR2: 00000000000002a8
> > [ 71.235964] ---[ end trace 0000000000000000 ]---
Looks like a quota disable race with background inode inactivation
reading in dquots.
Can you test the patch below?
-Dave.
--
Dave Chinner
david@...morbit.com
xfs: quotacheck failure can race with background inode inactivation
From: Dave Chinner <dchinner@...hat.com>
The background inode inactivation can attach dquots to inodes, but
this can race with a foreground quotacheck failure that leads to
disabling quotas and freeing the mp->m_quotainfo structure. The
background inode inactivation then tries to allocate a quota, tries
to dereference mp->m_quotainfo, and crashes like so:
XFS (loop1): Quotacheck: Unsuccessful (Error -5): Disabling quotas.
xfs filesystem being mounted at /root/syzkaller.qCVHXV/0/file0 supports timestamps until 2038 (0x7fffffff)
BUG: kernel NULL pointer dereference, address: 00000000000002a8
....
CPU: 0 PID: 161 Comm: kworker/0:4 Not tainted 6.2.0-c9c3395d5e3d #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: xfs-inodegc/loop1 xfs_inodegc_worker
RIP: 0010:xfs_dquot_alloc+0x95/0x1e0
....
Call Trace:
<TASK>
xfs_qm_dqread+0x46/0x440
xfs_qm_dqget_inode+0x154/0x500
xfs_qm_dqattach_one+0x142/0x3c0
xfs_qm_dqattach_locked+0x14a/0x170
xfs_qm_dqattach+0x52/0x80
xfs_inactive+0x186/0x340
xfs_inodegc_worker+0xd3/0x430
process_one_work+0x3b1/0x960
worker_thread+0x52/0x660
kthread+0x161/0x1a0
ret_from_fork+0x29/0x50
</TASK>
....
Prevent this race by flushing all pending background inode
inactivations before purging the cached dquots when quotacheck
fails.
Reported-by: Pengfei Xu <pengfei.xu@...el.com>
Signed-off-by: Dave Chinner <dchinner@...hat.com>
---
fs/xfs/xfs_qm.c | 40 ++++++++++++++++++++++++++--------------
1 file changed, 26 insertions(+), 14 deletions(-)
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index e2c542f6dcd4..78ca52e55f03 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -1321,15 +1321,14 @@ xfs_qm_quotacheck(
error = xfs_iwalk_threaded(mp, 0, 0, xfs_qm_dqusage_adjust, 0, true,
NULL);
- if (error) {
- /*
- * The inode walk may have partially populated the dquot
- * caches. We must purge them before disabling quota and
- * tearing down the quotainfo, or else the dquots will leak.
- */
- xfs_qm_dqpurge_all(mp);
- goto error_return;
- }
+
+ /*
+ * On error, the inode walk may have partially populated the dquot
+ * caches. We must purge them before disabling quota and tearing down
+ * the quotainfo, or else the dquots will leak.
+ */
+ if (error)
+ goto error_purge;
/*
* We've made all the changes that we need to make incore. Flush them
@@ -1363,10 +1362,8 @@ xfs_qm_quotacheck(
* and turn quotaoff. The dquots won't be attached to any of the inodes
* at this point (because we intentionally didn't in dqget_noattach).
*/
- if (error) {
- xfs_qm_dqpurge_all(mp);
- goto error_return;
- }
+ if (error)
+ goto error_purge;
/*
* If one type of quotas is off, then it will lose its
@@ -1376,7 +1373,7 @@ xfs_qm_quotacheck(
mp->m_qflags &= ~XFS_ALL_QUOTA_CHKD;
mp->m_qflags |= flags;
- error_return:
+error_return:
xfs_buf_delwri_cancel(&buffer_list);
if (error) {
@@ -1395,6 +1392,21 @@ xfs_qm_quotacheck(
} else
xfs_notice(mp, "Quotacheck: Done.");
return error;
+
+error_purge:
+ /*
+ * On error, we may have inodes queued for inactivation. This may try
+ * to attach dquots to the inode before running cleanup operations on
+ * the inode and this can race with the xfs_qm_destroy_quotainfo() call
+ * below that frees mp->m_quotainfo. To avoid this race, flush all the
+ * pending inodegc operations before we purge the dquots from memory,
+ * ensuring that background inactivation is idle whilst we turn off
+ * quotas.
+ */
+ xfs_inodegc_flush(mp);
+ xfs_qm_dqpurge_all(mp);
+ goto error_return;
+
}
/*
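For readers who want to see the ordering in isolation, here is a minimal
userspace sketch of the same idea: drain the background work first, and only
then free the structure it dereferences. It is an illustration under stated
assumptions, not the XFS code. Every name in it (struct mount, struct
quotainfo, gc_queue(), gc_flush(), quota_teardown()) is hypothetical, and a
pthread stands in for the inodegc worker.

```c
/* Userspace sketch only. All names are hypothetical, not XFS APIs. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct quotainfo {			/* stands in for mp->m_quotainfo */
	unsigned int	buckets;
};

struct mount {				/* stands in for struct xfs_mount */
	struct quotainfo *qi;
	pthread_t	gc_thread;	/* stands in for the inodegc worker */
	int		gc_started;
	unsigned int	gc_result;
};

/*
 * Background work. It reads through mp->qi, which is the kind of access
 * that crashes if the structure has already been torn down.
 */
static void *gc_worker(void *arg)
{
	struct mount *mp = arg;

	mp->gc_result = 9 % mp->qi->buckets;
	return NULL;
}

/* Queue one unit of background work. Returns 0 on success. */
static int gc_queue(struct mount *mp)
{
	int ret = pthread_create(&mp->gc_thread, NULL, gc_worker, mp);

	if (ret == 0)
		mp->gc_started = 1;
	return ret;
}

/* Wait until the queued background work has finished. */
static void gc_flush(struct mount *mp)
{
	if (mp->gc_started) {
		pthread_join(mp->gc_thread, NULL);
		mp->gc_started = 0;
	}
}

/* Error path: flush first, then free what the worker dereferences. */
static void quota_teardown(struct mount *mp)
{
	gc_flush(mp);
	free(mp->qi);
	mp->qi = NULL;
}
```

Swapping the first two steps of quota_teardown() would let gc_worker() run
against a freed or NULL pointer, which is the shape of the race described in
the commit message. Build with -pthread on toolchains that need it.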