Message-ID: <20191021013842.GF6799@mit.edu>
Date: Sun, 20 Oct 2019 21:38:42 -0400
From: "Theodore Y. Ts'o" <tytso@....edu>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org
Subject: Re: [PATCH 05/22] ext4: Fix ext4_should_journal_data() for EA inodes
On Fri, Oct 04, 2019 at 12:05:51AM +0200, Jan Kara wrote:
> Similarly to directories, EA inodes do only journalled modifications to
> their data. Change ext4_should_journal_data() to return true for them so
> that we don't have to special-case them during truncate.
We are already special-casing EA inodes in ext4_clear_blocks() in
fs/ext4/indirect.c, and get_default_free_blocks_flags() in
fs/ext4/extents.c, and like S_ISDIR, we want to treat EA inode blocks
as metadata. So I'm not sure I see the value of this change?
As an aside, I was looking at fs/ext4/mballoc.c to see what the
difference is for treating a block as a metadata block versus a
journaled data block, and what I found made my hair stand on end:
	/*
	 * We need to make sure we don't reuse the freed block until after the
	 * transaction is committed. We make an exception if the inode is to be
	 * written in writeback mode since writeback mode has weak data
	 * consistency guarantees.
	 */
So in data=writeback, if a file is deleted, its blocks are available
for immediate reallocation, and if we are under heavy memory pressure,
the deleted file's blocks could get overwritten --- even in the case
where we crash and the transaction never committed.
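To make the rule in that comment concrete, here is a simplified model of
the decision, not the actual fs/ext4/mballoc.c code. The enum, struct,
and function names are hypothetical stand-ins invented for illustration,
and the "no valid journal handle means no deferral" branch is an
assumption about how the real check is structured:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch; not the kernel's definitions. */
enum data_mode { MODE_JOURNAL, MODE_ORDERED, MODE_WRITEBACK };

struct freed_block {
	bool handle_valid;   /* assumed: a journal transaction is running */
	bool is_metadata;    /* freed as metadata (directory, EA inode, ...) */
	enum data_mode mode; /* data mode in effect for the owning inode */
};

/*
 * Returns true if reuse of the freed block must wait until the
 * transaction commits, false if it may be reallocated immediately.
 */
static bool must_defer_reuse(const struct freed_block *b)
{
	if (!b->handle_valid)
		return false;	/* assumption: nothing to wait for without a journal */
	if (b->is_metadata)
		return true;	/* metadata blocks always wait for the commit */
	/* Plain data blocks wait too, except in writeback mode. */
	return b->mode != MODE_WRITEBACK;
}

/* Small self-check covering the three interesting cases. */
static void must_defer_reuse_self_check(void)
{
	struct freed_block wb_data = { true, false, MODE_WRITEBACK };
	struct freed_block wb_meta = { true, true, MODE_WRITEBACK };
	struct freed_block ordered = { true, false, MODE_ORDERED };

	assert(!must_defer_reuse(&wb_data));
	assert(must_defer_reuse(&wb_meta));
	assert(must_defer_reuse(&ordered));
}
```

In this model the only case that escapes the wait-for-commit rule with a
journal present is a non-metadata block of an inode in writeback mode,
which is exactly the case discussed here.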
While it's true that data=writeback mode has weaker guarantees, my
understanding is that this only applies to the exposure of stale data,
and not to a long-standing file's blocks getting corrupted if the file
is almost, but not quite, deleted before a crash.
Granted, the situation where this would happen is quite rare, but it
seems quite wrong....
- Ted