Message-ID: <20140106022128.GA22936@localhost>
Date: Mon, 6 Jan 2014 10:21:28 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Muthu Kumar <muthu.lkml@...il.com>
Cc: Kent Overstreet <kmo@...erainc.com>, Jens Axboe <axboe@...nel.dk>,
linux-btrfs <linux-btrfs@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ux.intel.com
Subject: Re: [block:for-3.14/core] kernel BUG at fs/bio.c:1748
On Sun, Jan 05, 2014 at 08:28:57AM -0800, Muthu Kumar wrote:
> Fengguang,
> Instead of rebooting, can you trigger a crash dump when this happens
> and send us the backtrace (to start with)?
Muthu, good point! Attached is the full dmesg with backtrace:
[ 1398.988324] SysRq : Show Blocked State
[ 1398.992007] task PC stack pid father
[ 1398.992007] mount D 0000000000000002 0 2875 2870 0x00000000
[ 1398.992007] ffff88007f859a70 0000000000000082 ffff88007f859fd8 ffff8803d21c6c10
[ 1398.992007] 0000000000012fc0 ffff8803d21c6c10 0000000000000000 0000000000000000
[ 1398.992007] ffff8803d2d22068 0000000000000008 ffff88007f859a18 ffffffff814c2b62
[ 1398.992007] Call Trace:
[ 1398.992007] [<ffffffff814c2b62>] ? submit_bio+0x106/0x159
[ 1398.992007] [<ffffffff81431c6a>] ? __do_readpage+0x4b9/0x50e
[ 1398.992007] [<ffffffff81064a03>] ? kvm_clock_read+0x27/0x31
[ 1398.992007] [<ffffffff81064a16>] ? kvm_clock_get_cycles+0x9/0xb
[ 1398.992007] [<ffffffff811651a1>] ? filemap_fdatawait+0x23/0x23
[ 1398.992007] [<ffffffff819ff356>] schedule+0x6f/0x71
[ 1398.992007] [<ffffffff819ff59b>] io_schedule+0x8f/0xd6
[ 1398.992007] [<ffffffff811651af>] sleep_on_page+0xe/0x12
[ 1398.992007] [<ffffffff819ff861>] __wait_on_bit+0x48/0x7b
[ 1398.992007] [<ffffffff81165002>] wait_on_page_bit+0x7a/0x7c
[ 1398.992007] [<ffffffff810f7ee3>] ? autoremove_wake_function+0x34/0x34
[ 1398.992007] [<ffffffff81433eee>] read_extent_buffer_pages+0x1ae/0x23b
[ 1398.992007] [<ffffffff81410da7>] ? free_root_pointers+0x5b/0x5b
[ 1398.992007] [<ffffffff814123e5>] btree_read_extent_buffer_pages.constprop.48+0x66/0x100
[ 1398.992007] [<ffffffff814129d1>] read_tree_block+0x2f/0x47
[ 1398.992007] [<ffffffff814163e6>] open_ctree+0x1271/0x1adf
[ 1398.992007] [<ffffffff813f4243>] btrfs_mount+0x47b/0x771
[ 1398.992007] [<ffffffff814e1f8c>] ? get_from_free_list+0x41/0x4b
[ 1398.992007] [<ffffffff811c40bf>] mount_fs+0x15/0xae
[ 1398.992007] [<ffffffff811d9a52>] vfs_kern_mount+0x64/0xf6
[ 1398.992007] [<ffffffff811dbff6>] do_mount+0x781/0x878
[ 1398.992007] [<ffffffff8117d6c2>] ? strndup_user+0x3a/0xd6
[ 1398.992007] [<ffffffff811dc317>] SyS_mount+0x85/0xbe
[ 1398.992007] [<ffffffff81a09529>] system_call_fastpath+0x16/0x1b
[ 1398.992007] Sched Debug Version: v0.11, 3.13.0-rc6-00148-gc05f7ce #1
> Kent,
> Did you do any btrfs test with your changes?
Just try simple dd writes.
Thanks,
Fengguang
> Regards,
> Muthu
>
> On Sun, Jan 5, 2014 at 1:46 AM, Fengguang Wu <fengguang.wu@...el.com> wrote:
> > Hi Muthu,
> >
> > On Fri, Jan 03, 2014 at 11:51:31AM -0800, Muthu Kumar wrote:
> >> Looks like Kent missed the btrfs endio in the original commit. How
> >> about this patch:
> >>
> >> ---------
> >>
> >> In btrfs_end_bio, call bio_endio_nodec on the restored bio so the
> >> bi_remaining is accounted for correctly.
> >>
> >> Reported-by: fengguang.wu@...el.com
> >> Cc: Kent Overstreet <kmo@...erainc.com>
> >> CC: Jens Axboe <axboe@...nel.dk>
> >> Signed-off-by: Muthukumar Ratty <muthur@...il.com>
> >> --------
> >>
> >> fs/btrfs/volumes.c | 6 +++++-
> >> 1 files changed, 5 insertions(+), 1 deletions(-)
> >>
> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >> index f2130de..edfed52 100644
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -5316,7 +5316,11 @@ static void btrfs_end_bio(struct bio *bio, int err)
> >>  		}
> >>  		kfree(bbio);
> >>
> >> -		bio_endio(bio, err);
> >> +		/*
> >> +		 * Call endio_nodec on the restored bio so the bi_remaining is
> >> +		 * accounted for correctly
> >> +		 */
> >> +		bio_endio_nodec(bio, err);
> >>  	} else if (!is_orig_bio) {
> >>  		bio_put(bio);
> >>  	}
> >
> > Interestingly, the BUG message disappeared but it blocks the test run.
> > In the end, the test watchdog reboots the machine with SysRq:
> >
> > 2014-01-04 23:13:02 mount -t btrfs /dev/vda /fs/vda
> > [ 20.184264] btrfs: device fsid f0e06999-0518-47e0-a622-21b8749438be devid 1 transid 4 /dev/vda
> > [ 20.186552] btrfs: disk space caching is enabled
> > [ 131.360457] random: nonblocking pool is initialized
> > ==> [ 1465.069342] SysRq : Emergency Sync
> > ==> [ 1475.071055] SysRq : Resetting
> >
> > Attached is the full dmesg for a good run (v3.13-rc7) and a bad run
> > (this patch).
> >
> > Thanks,
> > Fengguang
View attachment "dmesg-bio_endio_nodec-w" of type "text/plain" (95352 bytes)
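The patch quoted above depends on how bi_remaining is accounted for when a bio is completed a second time after its end_io has been restored. The following is a minimal userspace sketch of that accounting, not kernel code. It assumes that bio_endio() consumes one count from bi_remaining and BUGs when the count is already zero, and that bio_endio_nodec() takes the count back before calling bio_endio(). The struct, enum, and function names prefixed with model_ are hypothetical and exist only for illustration.

```c
/* Simplified userspace model; NOT the kernel's struct bio. */
struct model_bio {
	int remaining;	/* stands in for the atomic bi_remaining counter */
	int callbacks;	/* how many times an end_io callback would have run */
};

enum model_result { MODEL_PENDING, MODEL_COMPLETED, MODEL_BUG };

/* Models bio_endio(): consumes one count, runs end_io when it hits zero. */
static enum model_result model_bio_endio(struct model_bio *bio)
{
	if (bio->remaining <= 0)
		return MODEL_BUG;	/* assumed to be where the kernel BUGs */
	if (--bio->remaining > 0)
		return MODEL_PENDING;	/* other completions still outstanding */
	bio->callbacks++;
	return MODEL_COMPLETED;
}

/* Models bio_endio_nodec(): give the count back, then complete as usual. */
static enum model_result model_bio_endio_nodec(struct model_bio *bio)
{
	bio->remaining++;
	return model_bio_endio(bio);
}
```

In this model, a bio that starts with remaining = 1 completes once through model_bio_endio(), which stands in for the lower layer invoking btrfs_end_bio. A second model_bio_endio() call on the same bio returns MODEL_BUG, while model_bio_endio_nodec() completes it again cleanly. That is the difference the patch relies on. The model says nothing about the mount hang reported afterwards.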