Message-ID: <20201002083929.GB17963@quack2.suse.cz>
Date: Fri, 2 Oct 2020 10:39:29 +0200
From: Jan Kara <jack@...e.cz>
To: Mauricio Faria de Oliveira <mfo@...onical.com>
Cc: Jan Kara <jack@...e.cz>, Andreas Dilger <adilger@...ger.ca>,
linux-ext4@...r.kernel.org,
dann frazier <dann.frazier@...onical.com>,
Ted Tso <tytso@....edu>
Subject: Re: [RFC PATCH v4 0/4] ext4/jbd2: data=journal: write-protect pages on transaction commit

On Thu 01-10-20 09:46:32, Mauricio Faria de Oliveira wrote:
> On Thu, Oct 1, 2020 at 4:34 AM Jan Kara <jack@...e.cz> wrote:
> > On Wed 30-09-20 19:59:44, Mauricio Faria de Oliveira wrote:
> > > 3) Now, the mixed-feelings news.
> > >
> > > The synthetic test-case/patches I had written clearly show that the
> > > patchset works:
> > > - In the original kernel, userspace can write to buffers during
> > > commit; and it moves on.
> > > - In the patched kernel, userspace cannot write to buffers during
> > > commit; it blocks.
> > >
> > > However, heavy-hammer testing with 'stress-ng --mmap 4xNCPUs --mmap-file',
> > > then crashing the kernel via sysrq-trigger and trying to mount the
> > > filesystem again, sometimes still finds invalid checksums, so journal
> > > recovery/mount fails.
> > >
> > > [ 98.194809] JBD2: Invalid checksum recovering data block 109704 in log
> > > [ 98.201853] JBD2: Invalid checksum recovering data block 69959 in log
> > > [ 98.339859] JBD2: recovery failed
> > > [ 98.340581] EXT4-fs (vdc): error loading journal
> > >
> > > So, despite the test exercising mmap() and the patchset being for mmap(),
> > > apparently there is more happening that also needs changes. (Weird; but
> > > I will try to debug that test-case behavior deeper, to find what's going on.)
> > >
> > > This patchset does address a problem, so should we move on with this one,
> > > and as you mentioned, "that would be something for another patch series :)" ?
> >
> > Thanks for the really thorough testing! If you can quickly debug where the
> > problem is still lurking, then cool, we can still fix it in this patch
> > series. If not, then I'm fine with just pushing what we have because
> > conceptually that seems like a sane thing to do anyway and we can fix the
> > remaining problem afterwards.
>
> Understood. I'll be able to look at this next week, which should be rc8 [1].
> Would it be good enough, timing wise, to send a non-RFC series with
> what we have (this other issue fixed or not) by the end of next week?

This is more a question for Ted as the maintainer (CCed), but the end of
next week is probably too late: Ted needs time to merge the patches into his
tree, run his battery of tests, push the changes to linux-next, and let them
simmer there for a while before sending them to Linus. So I'd say submit
what you have on Monday / Tuesday, and we can always add fixes on top as we
find them.

Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR