[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191023165554.GG31271@quack2.suse.cz>
Date: Wed, 23 Oct 2019 18:55:54 +0200
From: Jan Kara <jack@...e.cz>
To: "Theodore Y. Ts'o" <tytso@....edu>
Cc: Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org
Subject: Re: [PATCH 05/22] ext4: Fix ext4_should_journal_data() for EA inodes
On Sun 20-10-19 21:38:42, Theodore Y. Ts'o wrote:
> On Fri, Oct 04, 2019 at 12:05:51AM +0200, Jan Kara wrote:
> > Similarly to directories, EA inodes do only journalled modifications to
> > their data. Change ext4_should_journal_data() to return true for them so
> > that we don't have to special-case them during truncate.
>
> We are already special-casing EA inodes in ext4_clear_blocks() in
> fs/ext4/indirect.c, and get_default_free_blocks_flags() in
> fs/ext4/extents.c, and like S_ISDIR, we want to treat EA inode blocks
> as metadata. So I'm not sure I see the value of this change?
Firstly, ext4_should_journal_data() should tell whether inode's data blocks
are modified through journalling. So as a principle of least surprise it
should return true for EA inodes because that's how data blocks of those
inodes are modified.
Secondly, once ext4_should_journal_data() is fixed by this patch, I think
that we can just drop that special-casing from ext4_clear_blocks() and
get_default_free_blocks_flags() and just have there:
if (ext4_should_journal_data(inode))
flags |= EXT4_FREE_BLOCKS_FORGET;
> As an aside, I was looking at fs/ext4/mballoc.c to see what the
> difference is for treating a block as a metadata block versus a
> journaled data block, and what I found made my hair rise on end:
>
> /*
> * We need to make sure we don't reuse the freed block until after the
> * transaction is committed. We make an exception if the inode is to be
> * written in writeback mode since writeback mode has weak data
> * consistency guarantees.
> */
>
> So in data=writeback, if a file is deleted, its blocks are available
> for immediate reallocation, and if we are under heavy memory pressure,
> the deleted file's blocks could get overwritten --- even in the case
> where we crash and the transaction never committed.
>
> While it's true that date=writeback mode has weaker guarantees, my
> understanding is that it only applied to the exposure stale data, and
> not to a long-standing file's blocks getting corrupted if it is almost
> deleted, but not quite before a crash.
>
> Granted, the situation where this would happen is quite wrare, but it
> seems quite wrong....
I've always considered data=writeback as: You don't know what the data is
going to be if the file was touched shortly before crashing (i.e., similar
to old ext2 non-guarantees).
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists