[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160218220956.GA24219@quack.suse.cz>
Date: Thu, 18 Feb 2016 23:09:56 +0100
From: Jan Kara <jack@...e.cz>
To: Theodore Ts'o <tytso@....edu>
Cc: Eric Whitney <enwlinux@...il.com>, linux-ext4@...r.kernel.org,
jack@...e.cz
Subject: Re: [PATCH] ext4: revert i_data_sum locking cleanups for
dioread_nolock
On Tue 16-02-16 00:08:40, Ted Tso wrote:
> On Fri, Feb 12, 2016 at 01:25:06PM -0500, Eric Whitney wrote:
> > Commit 2bcba4781fa3 ("ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flag")
> > can cause a kernel panic when xfstest generic/300 is run on a file
> > system mounted with dioread_nolock. The panic is typically triggered
> > from check_irqs_on() (fs/buffer.c: 1272), and happens because
> > ext4_end_io_dio() is being called in an interrupt context rather than
> > from a workqueue for AIO. This suggests that buffer_defer_completion
> > may not be set properly when creating an unwritten extent for async
> > direct I/O.
> >
> > Revert the locking changes until this problem can be resolved. Patch
> > applied to 4.5-rc3 and tested with a full xfstest-bld auto group run.
> >
> > Signed-off-by: Eric Whitney <enwlinux@...il.com>
>
> Applied, thanks.
>
> I was able to reliably reproduce the problem without this revert patch
> using a 32-bit x86 kvm-xfstests test appliance:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/kvm-xfstests/testing/root_fs.img.i386
>
> Using the command: "kvm-xfstests -c dioread_nolock generic/300"
>
> With this patch, the problem goes away, so reverting the patch is
> clearly the right thing for now. There is something screwy going on,
> since the original commit shouldn't have made a difference, but let's
> revert it first, and figure it out when we have time to investigate
> more deeply.
OK, I had a look into this. So I'm not 100% what has happened but the
following looks likely: Current io_end handling can overwrite io_end
pointer in the inode in dioread_nolock mode (nothing prevents unlocked DIO
to overwrite pointer of locked DIO and then clear it out). I suspect that
the change in i_data_sem locking made this race more visible. Attached
patch should fix the issue (I don't see failures of generic/300 with it in
dioread_nolock mode). Can you consider this instead of a revert Eric sent?
I have also a more complete rewrite of io_end handling which makes the code
more comprehensible and avoids storing io_end pointer in the inode (thus
avoids similar pitfalls in future) but that is a 4.6 matter. I'll submit
the rewrite once xfstests runs complete.
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
View attachment "0001-ext4-Fix-crashes-in-dioread_nolock-mode.patch" of type "text/x-patch" (2610 bytes)
Powered by blists - more mailing lists