Message-ID: <20180702203047.GE30481@thunk.org>
Date: Mon, 2 Jul 2018 16:30:47 -0400
From: "Theodore Y. Ts'o" <tytso@....edu>
To: Andreas Dilger <adilger@...ger.ca>
Cc: Lukas Czerner <lczerner@...hat.com>, linux-ext4@...r.kernel.org
Subject: Re: [PATCH] e2fsck: do not allow initialized blocks pass i_size

On Fri, Jun 29, 2018 at 01:35:41PM -0600, Andreas Dilger wrote:
> >>> Right. So there are two choices:
> >>>
> >>> 1) Keep the blocks beyond i_size marked as uninitialized. You
> >>> transfer and write the full PAGE_SIZE of data, but it simply will
> >>> never be available to the user.
> >
> > Yes, that's for extent mapped files.
> >
> >>> 2) Zero the page, write it out to the file, and then extend i_size and
> >>> mark the extents as uninitialized.
> >
> > Except at that point you do not really need to mark the extent as
> > uninitialized, the blocks are allocated and written to and i_size is
> > extended. That's how it needs to be done for indirect block mapped
> > files.
> >>> Why is it that Lustre is choosing to keep i_size where it is, but to
> >>> mark the blocks beyond it as initialized?
> >>
> >> This isn't about initialized vs. uninitialized extents. It is only about
> >> allocated vs. unallocated blocks, possibly with block-mapped files. There
> >> is no way to have uninitialized blocks with a block-mapped file.

Does Lustre really support block-mapped files today? If so, why?

And if it must support block-mapped files and not only extent-mapped
files, is there any reason why Lustre can't just make sure that (a)
there are no blocks allocated past i_size --- ext4 can handle this
case just fine, even if that means there are parts of the page which
are not mapped to a block? Alternatively, (b) if (a) is impossible,
can it simply make sure i_size is moved to a page_size boundary and
all of the allocated blocks are zeroed if they haven't been written
yet?

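To make the arithmetic behind (a) and (b) concrete, here is a minimal
sketch. It is not code from e2fsck or ext4; the helper names and the
"one past the highest allocated logical block" convention are
assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round size up to a multiple of unit; unit must be a power of two. */
static uint64_t round_up(uint64_t size, uint64_t unit)
{
	assert(unit != 0 && (unit & (unit - 1)) == 0);
	return (size + unit - 1) & ~(unit - 1);
}

/* Number of logical blocks needed to hold i_size bytes. */
static uint64_t blocks_for_size(uint64_t i_size, uint64_t blocksize)
{
	return round_up(i_size, blocksize) / blocksize;
}

/*
 * Option (a) check (hypothetical helper): end_lblk is one past the
 * highest allocated logical block, or 0 for a file with no blocks.
 * Returns true if any allocated block lies entirely past i_size.
 */
static bool blocks_past_i_size(uint64_t i_size, uint64_t blocksize,
			       uint64_t end_lblk)
{
	return end_lblk > blocks_for_size(i_size, blocksize);
}

/*
 * Option (b) (hypothetical helper): the i_size that would cover every
 * block of the last page, assuming those blocks have been zeroed.
 */
static uint64_t page_aligned_i_size(uint64_t i_size, uint64_t page_size)
{
	return round_up(i_size, page_size);
}
```

For example, with 1k blocks and 4k pages, a file with i_size 5000 needs
5 blocks. A full-page write that allocates 8 blocks leaves logical
blocks 5 through 7 past i_size, and under (b) i_size would move to 8192.
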
> Like I said previously, this is done with Lustre, which has a different IO
> submission path than stock ext4. I don't think there is any requirement that
> this only be in upstream ext4, since e2fsprogs also has code to support running
> on BSD, Windows, even Hurd.

If neither (a) nor (b) is possible, I'm willing to entertain this. If
we have to go down that path, then it should be something that can be
configured, perhaps via /etc/e2fsck.conf. The reason for this is that
Lustre really is a minority use case, and it is *useful* for e2fsck to
flag cases where there are initialized blocks past i_size, since that
should never happen with the Linux stack. And if it does, it's a bug,
and we should (for example) flag it when running xfstests.

So I think what I'm going to do for 1.44.3 is to take Lukas's patch.
The old behavior can possibly be put back under some kind of
conditional, either via e2fsck.conf or via some kind of superblock
flag. Or it can be something that is patched back in for the Lustre
fork of e2fsprogs.

- Ted