[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7077c905-2a19-46f2-9f45-d82ed673d48b@huawei.com>
Date: Mon, 18 Nov 2024 09:29:19 +0800
From: Baokun Li <libaokun1@...wei.com>
To: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
CC: Ritesh Harjani <ritesh.list@...il.com>, <linux-ext4@...r.kernel.org>, Jan
Kara <jack@...e.com>, <linux-kernel@...r.kernel.org>,
<linux-fsdevel@...r.kernel.org>, Disha Goel <disgoel@...ux.ibm.com>, Yang
Erkun <yangerkun@...wei.com>
Subject: Re: [PATCH 1/1] quota: flush quota_release_work upon quota writeback
On 2024/11/17 1:59, Ojaswin Mujoo wrote:
> On Sat, Nov 16, 2024 at 02:20:26AM +0530, Ritesh Harjani wrote:
>> Ojaswin Mujoo <ojaswin@...ux.ibm.com> writes:
>>
>>> One of the paths quota writeback is called from is:
>>>
>>> freeze_super()
>>> sync_filesystem()
>>> ext4_sync_fs()
>>> dquot_writeback_dquots()
>>>
>>> Since we currently don't always flush the quota_release_work queue in
>>> this path, we can end up with the following race:
>>>
>>> 1. dquot are added to releasing_dquots list during regular operations.
>>> 2. FS freeze starts, however, this does not flush the quota_release_work queue.
>>> 3. Freeze completes.
>>> 4. Kernel eventually tries to flush the workqueue while FS is frozen which
>>> hits a WARN_ON since transaction gets started during frozen state:
>>>
>>> ext4_journal_check_start+0x28/0x110 [ext4] (unreliable)
>>> __ext4_journal_start_sb+0x64/0x1c0 [ext4]
>>> ext4_release_dquot+0x90/0x1d0 [ext4]
>>> quota_release_workfn+0x43c/0x4d0
>>>
>>> Which is the following line:
>>>
>>> WARN_ON(sb->s_writers.frozen == SB_FREEZE_COMPLETE);
>>>
>>> Which ultimately results in generic/390 failing due to dmesg
>>> noise. This was detected on powerpc machine 15 cores.
>>>
>>> To avoid this, make sure to flush the workqueue during
>>> dquot_writeback_dquots() so we dont have any pending workitems after
>>> freeze.
>> Not just that, sync_filesystem can also be called from other places and
>> quota_release_workfn() could write out and and release the dquot
>> structures if such are found during processing of releasing_dquots list.
>> IIUC, this was earlier done in the same dqput() context but had races
>> with dquot_mark_dquot_dirty(). Hence the final dqput() will now add the
>> dquot structures to releasing_dquots list and will schedule a delayed
>> workfn which will process the releasing_dquots list.
> Hi Ritesh,
>
> Ohh right, thanks for the context. I see this was done here:
>
> dabc8b207566 quota: fix dqput() to follow the guarantees dquot_srcu
> should provide
>
> Which went in v6.5. Let me cc Baokun as well.
Hello Ojaswin,
Nice catch! Thanks for fixing this up!
Have you tested the performance impact of this patch? It looks like the
unconditional call to flush_delayed_work() in dquot_writeback_dquots()
may have some performance impact for frequent chown/sync scenarios.
When calling release_dquot(), we will only remove the quota of an object
(user/group/project) from disk if it is not quota-limited and does not
use any inode or block.
Asynchronous removal is now much more performance friendly, not only does
it make full use of the multi-core, but for scenarios where we have to
repeatedly chown between two objects, delayed release avoids the need to
repeatedly allocate/free space in memory and on disk.
Overall, since the actual dirty data is already on the disk, there is no
consistency issue here as it is just clearing unreferenced quota on the
disk, so I thought maybe it would be better to call flush_delayed_work()
in the freeze context.
Thanks,
Baokun
>> And so after the final dqput and before the release_workfn gets
>> scheduled, if dquot gets marked as dirty or dquot_transfer gets called -
>> then I am suspecting that it could lead to a dirty or an active dquot.
>>
>> Hence, flushing the delayed quota_release_work at the end of
>> dquot_writeback_dquots() looks like the right thing to do IMO.
>>
>> But I can give another look as this part of the code is not that well
>> known to me.
>>
>>> Reported-by: Disha Goel <disgoel@...ux.ibm.com>
>>> Signed-off-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
>>> ---
>> Maybe a fixes tag as well?
> Right, I'll add that in the next revision. I believe it would be:
>
> Fixes: dabc8b207566 ("quota: fix dqput() to follow the guarantees dquot_srcu should provide")
>
> Regards,
> ojaswin
>
>>> fs/quota/dquot.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
>>> index 3dd8d6f27725..2782cfc8c302 100644
>>> --- a/fs/quota/dquot.c
>>> +++ b/fs/quota/dquot.c
>>> @@ -729,6 +729,8 @@ int dquot_writeback_dquots(struct super_block *sb, int type)
>>> sb->dq_op->write_info(sb, cnt);
>>> dqstats_inc(DQST_SYNCS);
>>>
>>> + flush_delayed_work("a_release_work);
>>> +
>>> return ret;
>>> }
>>> EXPORT_SYMBOL(dquot_writeback_dquots);
>>> --
>>> 2.43.5
Powered by blists - more mailing lists