Message-ID: <6604f73f-15f8-688d-a361-19503ffa9cf0@huawei.com>
Date: Sat, 12 Jan 2019 17:32:21 +0800
From: "zhangyi (F)" <yi.zhang@...wei.com>
To: Eryu Guan <guaneryu@...il.com>
CC: <linux-ext4@...r.kernel.org>, <tytso@....edu>,
<adilger.kernel@...ger.ca>, <jack@...e.cz>, <miaoxie@...wei.com>
Subject: Re: [PATCH] jbd2: set freed flag while revoking a buffer which
belongs to older transaction
On 2019/1/12 15:39, Eryu Guan Wrote:
> On Thu, Jan 10, 2019 at 02:12:02PM +0800, zhangyi (F) wrote:
>> Now, we capture a data corruption problem on ext4 while we're truncating
>> an extent index block. Imagine that we are revoking a buffer which
>> has been journaled by the committing transaction: the buffer's jbddirty
>> flag will not be cleared in jbd2_journal_forget(), so the commit code
>> will set the buffer dirty flag again after refiling the buffer.
>>
>>     fsx                                kjournald2
>>                                        jbd2_journal_commit_transaction
>>     jbd2_journal_revoke                commit phase 1~5...
>>      jbd2_journal_forget
>>        belongs to older transaction    commit phase 6
>>        jbddirty not clear               __jbd2_journal_refile_buffer
>>                                          __jbd2_journal_unfile_buffer
>>                                           test_clear_buffer_jbddirty
>>                                            mark_buffer_dirty
>>
>> Finally, if the freed extent index block was allocated again as data
>> block by some other files, it may corrupt the file data when writing
>> cached pages later, such as during umount time.
>>
>> This patch marks the buffer as freed when it already belongs to the
>> committing transaction in jbd2_journal_forget(), so that the commit code
>> knows it should clear the dirty bits when it is done with the buffer.
>>
>> This problem can be reproduced by xfstests generic/455 easily with
>> seeds (3246 3247 3248 3249).
>
> Would you please capture the fsx ops sequences that could reproduce the
> problem and replay it in a targeted regression test, like what
> generic/{499,511} do? Thanks!
>
Yes, I will do it. But this problem is timing dependent, so I am afraid
a targeted regression test will not reproduce it every time (even
generic/455 with the above seeds does not always hit it).

BTW, we have only tested and captured this problem on ext4, and I am not
sure whether other file systems have the same problem. So would it be
better to put this test in the tests/ext4 group?
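
For reference, the flag handling described in the quoted commit message can
be modelled roughly as below. This is only a toy sketch in userspace C, not
the jbd2 code: struct toy_buffer and every toy_* function are hypothetical
names made up for illustration, and the real commit path does much more.
It only shows why a buffer that is forgotten while it belongs to the
committing transaction ends up dirty again, and how a "freed" mark lets the
commit side drop the dirty bits instead.

```c
#include <stdbool.h>

/* Toy stand-in for the buffer state discussed above. This is NOT the
 * kernel's struct buffer_head or journal_head; all names are hypothetical. */
struct toy_buffer {
    bool jbddirty;          /* journal-private dirty bit */
    bool dirty;             /* ordinary dirty bit seen by writeback */
    bool freed;             /* "freed" mark set at forget time */
    bool in_committing_txn; /* buffer belongs to the committing transaction */
};

/* Model of forgetting a buffer. If the buffer belongs to the committing
 * transaction, jbddirty is left set; with apply_fix the buffer is also
 * marked freed so the commit side knows to drop the dirty bits. */
void toy_forget(struct toy_buffer *b, bool apply_fix)
{
    if (b->in_committing_txn) {
        if (apply_fix)
            b->freed = true;
        return;
    }
    b->jbddirty = false;
}

/* Model of the refile at the end of commit: a freed buffer has its dirty
 * bits cleared; otherwise jbddirty is test-and-cleared and turned into
 * the ordinary dirty bit. */
void toy_commit_refile(struct toy_buffer *b)
{
    if (b->freed) {
        b->jbddirty = false;
        b->dirty = false;
    }
    bool was_jbddirty = b->jbddirty;
    b->jbddirty = false;
    if (was_jbddirty)
        b->dirty = true;
    b->in_committing_txn = false;
}

/* Run the forget-then-commit sequence and report whether the (already
 * freed) block is left dirty afterwards. */
bool toy_block_dirty_after_commit(bool apply_fix)
{
    struct toy_buffer b = { .jbddirty = true, .in_committing_txn = true };

    toy_forget(&b, apply_fix);
    toy_commit_refile(&b);
    return b.dirty;
}
```

In this model, toy_block_dirty_after_commit(false) leaves the block dirty,
which corresponds to the stale write that can hit a reallocated block, while
toy_block_dirty_after_commit(true) leaves it clean.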
Thanks,
Yi.