[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1d43599f-fed1-b37e-a411-2b0f31583991@gmail.com>
Date: Wed, 19 May 2021 09:27:56 +0800
From: Wang Jianchao <jianchao.wan9@...il.com>
To: "Theodore Y. Ts'o" <tytso@....edu>
Cc: Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ext4: get discard out of jbd2 commit kthread
On 2021/5/18 10:57 PM, Theodore Y. Ts'o wrote:
> On Tue, May 18, 2021 at 09:19:13AM +0800, Wang Jianchao wrote:
>>> That way we don't need to move all of this to a kworker context.
>>
>> The submit_bio also needs to be out of jbd2 commit kthread as it may be
>> blocked due to blk-wbt or no enough request tag. ;)
>
> Actually, there's a bigger deal that I hadn't realized, about why we
> is why are currently using submit_bio_wait(). We *must* wait until
> discard has completed before we call ext4_free_data_in_buddy(), which
> is what allows those blocks to be reused by the block allocator.
>
> If the discard happens after we reallocate the block, there is a good
> chance that we will end up corrupting a data or metadata block,
> leading to user data loss.
Yes
>
> There's another corollary to this; if you use blk-wbt, and you are
> doing lots of deletes, and we move this all to a writeback thread,
> this *significantly* increases the chance that the user will see
> ENOSPC errors in the case where they are with a very full (close to
> 100% used) file system.
We would flush the kwork that's doing discard in this patch.
That's done in ext4_should_retry_alloc()
>
> I'd argue that this is a *really* good reason why using mount -o
> discard is Just A Bad Idea if you are running with blk-wbt. If
> discards are slow, using fstrim is a much better choice. It's also
> the case that for most SSD's and workloads, doing frequent discards
> doesn't actually help that much. The write endurance of the device is
> not compromised that much if you only run fs-trim and discard unused
> blocks once a day, or even once a week --- I only recommend use of
> mount -o discard in cases where the discard operation is effectively
> free. (e.g., in cases where the FTL is implemented on the Host OS, or
> you are running with super-fast flash which is PCIe or NVMe attached.)
We're running ext4 with discard on a nbd device whose backend is storage
cluster. The discard can help to free the unused space to storage pool.
And sometimes application delete a lot of data and discard is flooding.
Then we see the jbd2 commit kthread is blocked for a long time. Even
move the discard out of jbd2, we still see the write IO of jbd2 log
could be blocked. blk-wbt could help to relieve this. Finally the delay
is shift to allocation path. But this is better than blocking the page
fault path which holds the read mm->mmap_sem.
Best regards
Jianchao
>
> Cheers,
>
> - Ted
>
Powered by blists - more mailing lists