Message-Id: <87ttmef3fp.fsf@doe.com>
Date: Mon, 12 Feb 2024 14:46:10 +0530
From: Ritesh Harjani (IBM) <ritesh.list@...il.com>
To: "Darrick J. Wong" <djwong@...nel.org>, Zhang Yi <yi.zhang@...weicloud.com>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca, jack@...e.cz, hch@...radead.org, willy@...radead.org, zokeefe@...gle.com, yi.zhang@...wei.com, chengzhihao1@...wei.com, yukuai3@...wei.com, wangkefeng.wang@...wei.com
Subject: Re: [RFC PATCH v3 00/26] ext4: use iomap for regular file's buffered IO path and enable large folio
"Darrick J. Wong" <djwong@...nel.org> writes:
> On Sat, Jan 27, 2024 at 09:57:59AM +0800, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@...wei.com>
>>
>> Hello,
>>
>> This is the third version of the RFC patch series that converts ext4 regular
>> file's buffered IO path to iomap and enables large folios. It's rebased on
>> 6.7 and Christoph's "map multiple blocks per ->map_blocks in iomap
>> writeback" series [1]. I've fixed all issues found in v2 during about 3
>> weeks of stress tests and fault injection tests. I hope I've
>> covered most of the corner cases, and any comments are welcome. :)
>>
>> Changes since v2:
>> - Update patch 1-6 to v3 [2].
>> - iomap_zero and iomap_unshare don't need to update i_size and call
>> iomap_write_failed(), introduce a new helper iomap_write_end_simple()
>> to avoid doing that.
>> - Factor out ext4_[ext|ind]_map_blocks() parts from ext4_map_blocks(),
>> introduce a new helper ext4_iomap_map_one_extent() to allocate
>> delalloc blocks in writeback, which is always under i_data_sem in
>> write mode. This is done to prevent the delalloc extents being written
>> back from becoming stale if writeback races with truncate.
>> - Add a lock detection in mapping_clear_large_folios().
>> Changes since v1:
>> - Introduce seq count for iomap buffered write and writeback to protect
>> against races with extent changes, e.g. truncate, mwrite.
>> - Always allocate unwritten extents for new blocks, drop dioread_lock
>> mode, and make no distinctions between dioread_lock and
>> dioread_nolock.
>> - Don't add dirty data range to jinode, drop data=ordered mode, and
>> make no distinctions between data=ordered and data=writeback mode.
>> - Postpone updating i_disksize to endio.
>> - Allow splitting extents and use reserved space in endio.
>> - Instead of reimplementing a new delayed mapping helper
>> ext4_iomap_da_map_blocks() for buffer write, try to reuse
>> ext4_da_map_blocks().
>> - Add support for disabling large folio on active inodes.
>> - Support online defragmentation, make file fall back to buffer_head
>> and disable large folio in ext4_move_extents().
>> - Move ext4_nonda_switch() in advance to prevent deadlock in mwrite.
>> - Add dirty_len and pos trace info to trace_iomap_writepage_map().
>> - Update patch 1-6 to v2.
>>
>> This series only supports ext4 with the default features and mount
>> options; it doesn't support inline_data, bigalloc, dax, fs_verity, fs_crypt
>> and data=journal mode, ext4 would fall back to buffer_head path
>
> Do you plan to add bigalloc or !extents support as a part 2 patchset?
>
Hi Darrick,
> An ext2 port to iomap has been (vaguely) in the works for a while,
Yes, we have [1][2]. I am in the process of rebasing that work onto the latest
upstream. It's been a while since my last post because I was pulled
into some other internal work, sorry about that.
> though iirc willy never got the performance to match because iomap
Ohh, can you share details on which performance benchmark was
run? I can try to run it when I rebase.
> didn't have a mechanism for the caller to tell it "run the IO now even
> though you don't have a complete page, because the indirect block is the
> next block after the 11th block".
Do you mean this for a large folio? I still don't understand the problem you
are referring to here. Could you please explain why that would be a
problem?
[1]: https://lore.kernel.org/linux-ext4/9cdd449fc1d63cf2dba17cfa2fa7fb29b8f96a46.1700506526.git.ritesh.list@gmail.com/
[2]: https://lore.kernel.org/linux-ext4/8734wnj53k.fsf@doe.com/
-ritesh