[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20101108011919.GN13830@dastard>
Date: Mon, 8 Nov 2010 12:19:19 +1100
From: Dave Chinner <david@...morbit.com>
To: Ted Ts'o <tytso@....edu>, linux-kernel@...r.kernel.org,
linux-ext4@...r.kernel.org
Subject: Re: [PATCH -BUGFIX] Re: BUG in ext4 with 2.6.37-rc1
On Sun, Nov 07, 2010 at 04:21:43PM -0500, Ted Ts'o wrote:
> On Wed, Nov 03, 2010 at 04:53:06PM -0400, Nick Bowler wrote:
> >
> > OK, it's 100% reproducible: the kernel BUGs, without fail, every time I
> > do 'make install' in the gcc build tree. After applying the patch, it
> > seems that the original BUG is gone, but now there's a new one:
>
> OK, I think I have a new version of the patch that should fix both the
> original and the second BUG_ON which you reported. Unfortunately,
> I've not been able to reproduce the BUG_ON, even by downloading gcc
> 4.5.1, configuring to build just gcc, and then doing a "make install".
> I'm not sure what languages you had enabled, or how much memory you
> have on your machine, etc., all of which might have made it harder for
> me to repro the bug. Still, I think I now understand what's causing
> it.
>
> Can you give this a try and let me know if this fixes things for you?
>
> - Ted
>
> From 94acf883b316f50b38018f710341a943cdd38d0d Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <tytso@....edu>
> Date: Sun, 7 Nov 2010 16:11:42 -0500
> Subject: [PATCH -v2] ext4: handle writeback of inodes which are being freed
>
> The following BUG can occur when an inode which is getting freed when
> it still has dirty pages outstanding, and it gets deleted (in this
> because it was the target of a rename). In ordered mode, we need to
> make sure the data pages are written just in case we crash before the
> rename (or unlink) is committed. If the inode is being freed then
> when we try to igrab the inode, we end up tripping the BUG_ON at
> fs/ext4/page-io.c:146.
>
> In this case, don't need to do any of the endio handling, so we can
> drop the BUG_ON and then arrange to make ext4_bio_end() handle this
> case appropriately by releasing the pages and ending the writeback on
> those pages, and then returning early without queueing the io_end
> structure on the workqueue.
>
> kernel BUG at /scratch_space/linux-2.6/fs/ext4/page-io.c:146!
> Call Trace:
> [<ffffffff811075b1>] ext4_bio_write_page+0x172/0x307
> [<ffffffff811033a7>] mpage_da_submit_io+0x2f9/0x37b
> [<ffffffff811068d7>] mpage_da_map_and_submit+0x2cc/0x2e2
> [<ffffffff811069b3>] mpage_add_bh_to_extent+0xc6/0xd5
> [<ffffffff81106c66>] write_cache_pages_da+0x2a4/0x3ac
> [<ffffffff81107044>] ext4_da_writepages+0x2d6/0x44d
> [<ffffffff81087910>] do_writepages+0x1c/0x25
> [<ffffffff810810a4>] __filemap_fdatawrite_range+0x4b/0x4d
> [<ffffffff810815f5>] filemap_fdatawrite_range+0xe/0x10
> [<ffffffff81122a2e>] jbd2_journal_begin_ordered_truncate+0x7b/0xa2
> [<ffffffff8110615d>] ext4_evict_inode+0x57/0x24c
> [<ffffffff810c14a3>] evict+0x22/0x92
> [<ffffffff810c1a3d>] iput+0x212/0x249
> [<ffffffff810bdf16>] dentry_iput+0xa1/0xb9
> [<ffffffff810bdf6b>] d_kill+0x3d/0x5d
> [<ffffffff810be613>] dput+0x13a/0x147
> [<ffffffff810b990d>] sys_renameat+0x1b5/0x258
> [<ffffffff81145f71>] ? _atomic_dec_and_lock+0x2d/0x4c
> [<ffffffff810b2950>] ? cp_new_stat+0xde/0xea
> [<ffffffff810b29c1>] ? sys_newlstat+0x2d/0x38
> [<ffffffff810b99c6>] sys_rename+0x16/0x18
> [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b
>
> Reported-by: Nick Bowler <nbowler@...iptictech.com>
> Signed-off-by: "Theodore Ts'o" <tytso@....edu>
> ---
> fs/ext4/page-io.c | 45 ++++++++++++++++++++++-----------------------
> 1 files changed, 22 insertions(+), 23 deletions(-)
>
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index 46a7d6a..94bbdb4 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -143,7 +143,6 @@ ext4_io_end_t *ext4_init_io_end(struct inode *inode, gfp_t flags)
> if (io) {
> memset(io, 0, sizeof(*io));
> io->inode = igrab(inode);
> - BUG_ON(!io->inode);
> INIT_WORK(&io->work, ext4_end_io_work);
> INIT_LIST_HEAD(&io->list);
> }
> @@ -171,35 +170,15 @@ static void ext4_end_bio(struct bio *bio, int error)
> struct workqueue_struct *wq;
> struct inode *inode;
> unsigned long flags;
> - ext4_fsblk_t err_block;
> int i;
>
> BUG_ON(!io_end);
> - inode = io_end->inode;
> bio->bi_private = NULL;
> bio->bi_end_io = NULL;
> if (test_bit(BIO_UPTODATE, &bio->bi_flags))
> error = 0;
> - err_block = bio->bi_sector >> (inode->i_blkbits - 9);
> bio_put(bio);
>
> - if (!(inode->i_sb->s_flags & MS_ACTIVE)) {
> - pr_err("sb umounted, discard end_io request for inode %lu\n",
> - io_end->inode->i_ino);
> - ext4_free_io_end(io_end);
> - return;
> - }
> -
> - if (error) {
> - io_end->flag |= EXT4_IO_END_ERROR;
> - ext4_warning(inode->i_sb, "I/O error writing to inode %lu "
> - "(offset %llu size %ld starting block %llu)",
> - inode->i_ino,
> - (unsigned long long) io_end->offset,
> - (long) io_end->size,
> - (unsigned long long) err_block);
> - }
> -
> for (i = 0; i < io_end->num_io_pages; i++) {
> struct page *page = io_end->pages[i]->p_page;
> struct buffer_head *bh, *head;
> @@ -254,9 +233,30 @@ static void ext4_end_bio(struct bio *bio, int error)
> if (!partial_write)
> SetPageUptodate(page);
> }
> -
> io_end->num_io_pages = 0;
>
> + if ((inode = io_end->inode) == NULL)
> + goto no_work;
> +
> + if (!(inode->i_sb->s_flags & MS_ACTIVE)) {
> + pr_err("sb umounted, discard end_io request for inode %lu\n",
> + io_end->inode->i_ino);
> + no_work:
> + ext4_free_io_end(io_end);
> + return;
> + }
Doesn't this mean you are tossing away unwritten extent conversion
processing when IO is issued on any inode in the I_WILL_FREE/I_FREEING
state, or completing IO after the unmount is in progress?
Cheers,
Dave.
> +
> + if (error) {
> + io_end->flag |= EXT4_IO_END_ERROR;
> + ext4_warning(inode->i_sb, "I/O error writing to inode %lu "
> + "(offset %llu size %ld starting block %llu)",
> + inode->i_ino,
> + (unsigned long long) io_end->offset,
> + (long) io_end->size,
> + (unsigned long long)
> + bio->bi_sector >> (inode->i_blkbits - 9));
> + }
> +
> /* Add the io_end to per-inode completed io list*/
> spin_lock_irqsave(&EXT4_I(inode)->i_completed_io_lock, flags);
> list_add_tail(&io_end->list, &EXT4_I(inode)->i_completed_io_list);
> @@ -305,7 +305,6 @@ static int io_submit_init(struct ext4_io_submit *io,
> bio->bi_private = io->io_end = io_end;
> bio->bi_end_io = ext4_end_bio;
>
> - io_end->inode = inode;
> io_end->offset = (page->index << PAGE_CACHE_SHIFT) + bh_offset(bh);
>
> io->io_bio = bio;
> --
> 1.7.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
--
Dave Chinner
david@...morbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists