Message-ID: <20260203131407.GA27241@macsyma.lan>
Date: Tue, 3 Feb 2026 08:14:07 -0500
From: "Theodore Tso" <tytso@....edu>
To: Zhang Yi <yi.zhang@...weicloud.com>
Cc: Christoph Hellwig <hch@...radead.org>, linux-ext4@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
adilger.kernel@...ger.ca, jack@...e.cz, ojaswin@...ux.ibm.com,
ritesh.list@...il.com, djwong@...nel.org,
Zhang Yi <yi.zhang@...wei.com>, yizhang089@...il.com,
libaokun1@...wei.com, yangerkun@...wei.com,
yukuai@...-78bjiv52429oh8qptp.cn-shenzhen.alb.aliyuncs.com
Subject: Re: [PATCH -next v2 00/22] ext4: use iomap for regular file's
buffered I/O path
On Tue, Feb 03, 2026 at 05:18:10PM +0800, Zhang Yi wrote:
> This means that the ordered journal mode is no longer used in ext4
> under the iomap infrastructure. The main reason is that iomap
> processes each folio one by one during writeback. It first holds the
> folio lock and then starts a transaction to create the block mapping.
> If we still use the ordered mode, we need to perform writeback in
> the logging process, which may require initiating a new transaction,
> potentially leading to deadlock issues. In addition, ordered journal
> mode indeed has many synchronization dependencies, which increase
> the risk of deadlocks, and I believe this is one of the reasons why
> ext4_do_writepages() is implemented in such a complicated manner.
> Therefore, I think we need to give up using the ordered data mode.
>
> Currently, there are three scenarios where the ordered mode is used:
> 1) append write,
> 2) partial block truncate down, and
> 3) online defragmentation.
>
> For append write, we can always allocate unwritten blocks to avoid
> using the ordered journal mode.
This is going to be a pretty severe performance regression, since it
means that we will be doubling the journal load for append writes.
What we really need to do here is to first write out the data blocks,
and then only start the transaction handle to modify the data blocks
*after* the data blocks have been written (to heretofore unused
blocks that were just allocated). It means inverting the order in
which we write data blocks for the append write case, and in fact it
will improve fsync() performance since we won't be gating writing the
commit block on the data blocks getting written out in the append
write case.
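To make the journal-load argument concrete, here is a toy model of the two
orderings, counting journaled metadata updates per append write. It is
only a sketch: every function name is hypothetical, nothing here
corresponds to real ext4 or jbd2 code, and it assumes that the
unwritten-extent approach journals one update to allocate and a second to
convert, while the data-first approach journals a single update once the
data is on disk.

```python
# Toy model only: counts journaled metadata updates per append write.
# All names are hypothetical; none map to real ext4 or jbd2 functions.

def append_with_unwritten_extents(log):
    # Assumed: allocation is journaled first, with the extent marked unwritten.
    log.append("journal: allocate blocks as unwritten")
    log.append("data: write blocks")
    # Assumed: a second journaled update converts the extent after the I/O.
    log.append("journal: convert unwritten extent to written")


def append_data_first(log):
    # Data goes to just-allocated blocks that no on-disk metadata references yet.
    log.append("data: write blocks to newly allocated, unreferenced blocks")
    # Only after the data is written does a handle start to update metadata.
    log.append("journal: map blocks into the file and update size")


def run(strategy):
    """Run one append under the given strategy and return its event log."""
    log = []
    strategy(log)
    return log


def journal_updates(log):
    """Count the events that go through the journal."""
    return sum(1 for event in log if event.startswith("journal:"))


if __name__ == "__main__":
    for strategy in (append_with_unwritten_extents, append_data_first):
        log = run(strategy)
        print(strategy.__name__, journal_updates(log))
```

In this model the data-first ordering never leaves a journaled update
waiting on data I/O, which is the property that would let the commit
block go out without being gated on the data blocks.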
Cheers,
- Ted