Message-ID: <20200307023212.GA7845@mit.edu>
Date: Fri, 6 Mar 2020 21:32:12 -0500
From: "Theodore Y. Ts'o" <tytso@....edu>
To: "Darrick J. Wong" <darrick.wong@...cle.com>
Cc: Jan Kara <jack@...e.cz>, Ritesh Harjani <riteshh@...ux.ibm.com>,
linux-ext4@...r.kernel.org, adilger.kernel@...ger.ca,
linux-fsdevel@...r.kernel.org, hch@...radead.org,
cmaiolino@...hat.com, david@...morbit.com
Subject: Re: [PATCHv5 3/6] ext4: Move ext4 bmap to use iomap infrastructure.

On Wed, Mar 04, 2020 at 07:37:45AM -0800, Darrick J. Wong wrote:
> > > This makes me wonder if you still need the filemap_write_and_wait in the
> > > JDATA case because otherwise the journal flush won't have the effect of
> > > writing all the dirty pagecache back to the filesystem? OTOH I suppose
> > > the implicit write-and-wait call after we clear JDATA will not be
> > > writing to the journal.
> > >
> > > Even more weirdly, the FIEMAP code doesn't drop JDATA at all...?
> >
> > Yeah, it should do that but that's only performance optimization so that we
> > bother with journal flushing only when someone uses block mapping call on
> > a file with journalled dirty data. So you can hardly notice the bug by
> > testing...
>
> If we ever decide to deprecate FIBMAP officially and push bootloaders to
> use FIEMAP, then we'll have to emulate all the flushing behaviors. But
> that's something for a separate patch.

This is really only needed for LILO, since I believe this is the only
bootloader which uses the output of FIBMAP to determine the block
number where it will attempt to ***write*** into a data block of a
mounted file system.

I seem to recall either Dave or Christoph ranting at one point that
any program which attempted to write into a mounted file system using
the output of FIEMAP was insane, and we should not be encouraging that
kind of wacko behavior. :-)

What most bootloaders want is simply the accurate list of block
locations so they can write that into the stage 1 bootloader so it can
read the stage 2 bootloader from the disk. The reason why we have the
JDATA hack in the bmap code is because LILO will get the block
location, and then try to write config information into that block.
So we are trying to prevent LILO's write of the boot command line from
possibly getting overwritten by a journal replay. (Of course, no
distribution installer would do something so rude as to just forcibly
reboot the system without a clean unmount, so this would *never* be
a problem, RIGHT? :-)
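
For illustration, here is a minimal sketch of the "sane" read-only case:
a userspace program that only asks for the list of physical block
locations through FIEMAP and never writes to them. The names
print_extents() and MAX_EXTENTS are made up for this example, and the
32-extent cap is an arbitrary assumption. FIEMAP_FLAG_SYNC asks the
kernel to sync the file before mapping it; as noted earlier in the
thread, that is not the same thing as the JDATA journal flush that the
bmap path does.

```c
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define MAX_EXTENTS 32	/* arbitrary cap, an assumption of this sketch */

/*
 * Print the physical extents backing `path` (hypothetical helper).
 * This is a read-only query; nothing is written to the reported blocks.
 * Returns the number of extents reported, or -1 on error.
 */
int print_extents(const char *path)
{
	struct fiemap *fm;
	size_t sz;
	int fd, ret = -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* struct fiemap is followed by an array of extent records. */
	sz = sizeof(*fm) + MAX_EXTENTS * sizeof(struct fiemap_extent);
	fm = calloc(1, sz);
	if (!fm) {
		close(fd);
		return -1;
	}

	fm->fm_start = 0;
	fm->fm_length = FIEMAP_MAX_OFFSET;	/* map the whole file */
	fm->fm_flags = FIEMAP_FLAG_SYNC;	/* sync the file before mapping */
	fm->fm_extent_count = MAX_EXTENTS;

	if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0) {
		for (unsigned int i = 0; i < fm->fm_mapped_extents; i++) {
			const struct fiemap_extent *e = &fm->fm_extents[i];

			printf("logical %llu physical %llu length %llu flags 0x%x\n",
			       (unsigned long long)e->fe_logical,
			       (unsigned long long)e->fe_physical,
			       (unsigned long long)e->fe_length,
			       (unsigned int)e->fe_flags);
		}
		ret = (int)fm->fm_mapped_extents;
	}

	free(fm);
	close(fd);
	return ret;
}
```

A caller would do something like print_extents("/boot/vmlinuz") and
treat -1 as "could not map"; whether the ioctl succeeds depends on the
underlying file system supporting FIEMAP.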

In any case, I'd much rather try to get LILO fixed to do something
sane, rather than move that heavy-ugly JDATA code into FIEMAP, where
it might get triggered unnecessarily by 99.9% of the users who are
doing something not-insane.

- Ted