[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <08c67287-53d8-58ce-e4bb-b1656bc6013e@huawei.com>
Date: Mon, 27 Mar 2023 19:17:48 +0800
From: Zhang Yi <yi.zhang@...wei.com>
To: Chung-Chiang Cheng <shepjeng@...il.com>, Jan Kara <jack@...e.cz>
CC: Chung-Chiang Cheng <cccheng@...ology.com>,
<linux-ext4@...r.kernel.org>, <tytso@....edu>,
<adilger.kernel@...ger.ca>, <kernel@...heng.net>,
Robbie Ko <robbieko@...ology.com>
Subject: Re: [PATCH] ext4: defer updating i_disksize until endio
On 2023/3/27 18:28, Chung-Chiang Cheng wrote:
> On Mon, Mar 27, 2023 at 5:29 PM Jan Kara <jack@...e.cz> wrote:
>>
>> As Zhang Yi already noted in his review, this is expected at least with
>> data=writeback mount option. With data=ordered this should not happen
>> though as the commit of the transaction with i_disksize update will wait
>> for page writeback to complete (this is exactly the reason why data=ordered
>> exists after all). Are you able to observe this problem with data=ordered
>> mount option?
>>
>> Honza
>
> It's a pity that this issue also occurs with data=ordered due to delayed
> allocation being enabled by default. If delayed allocation were disabled,
> it would not be as easy to reproduce.
>
> This is because if data is written to the end of a file and the block is
> allocated, the new i_disksize will be immediately committed to the journal
> at ext4_da_write_end(), but the writeback procedure is not yet triggered.
> By default, ext4 commits the journal every 5 seconds, but a dirty page may
> not be written back until 30 seconds later. This is not a short time window,
> and any improper shutdown during this time may lead to the issue :(
>
It seems that the case you've mentioned is intra-block append write (no?),
current data=ordered mount option doesn't work in this case because
ext4_map_blocks() doesn't attach inode to the t_inode_list of the running
transaction. If delayed allocation were disabled, the lose data window is still
there, because ext4_write_end()->ext4_update_inode_size() is also updating
i_disksize before writing data back. This is at least guarantee no store data.
We had discussed this in [1].
[1]. https://lore.kernel.org/linux-ext4/1554370192-113254-1-git-send-email-yi.zhang@huawei.com/
Thanks,
Yi.
Powered by blists - more mailing lists