Message-ID: <20070616081426.GB14349@localhost.sw.ru>
Date:	Sat, 16 Jun 2007 12:14:26 +0400
From:	Dmitriy Monakhov <dmonakhov@...ru>
To:	Mingming Cao <cmm@...ibm.com>
Cc:	Alex Tomas <alex@...sterfs.com>, linux-ext4@...r.kernel.org
Subject: Re: delayed allocation result in Oops
On 16:16 Fri 15 Jun, Mingming Cao wrote:
> I hit almost the same issue today, but with a different error number and
> one more kernel oops, when running fsstress on x86_64.
> 
> EXT4-fs: writeback error = -2
> EXT4-fs: writeback error = -2
This error never happens during writeback in my case; I only see ENOSPC.
Btw: I've sent one more micro fix (see "[PATCH] ext4:fix invariant checking
in ext4_rebalance_reservation" in this thread).
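For readers comparing the two reports: the number after "writeback error =" is a negated errno value. The sketch below, which assumes Linux errno numbering (ENOENT is 2, ENOSPC is 28), names the two codes seen in this thread. The helpers wb_error_name() and wb_error_describe() are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not kernel code: name the value printed after
 * "EXT4-fs: writeback error = ". Only the two codes seen in this
 * thread are handled. */
const char *wb_error_name(int logged)
{
    switch (-logged) {
    case ENOENT:
        return "ENOENT";
    case ENOSPC:
        return "ENOSPC";
    default:
        return "unknown";
    }
}

/* Print one logged value with its name and the libc description. */
void wb_error_describe(int logged)
{
    /* The kernel logs negated errno values, so expect logged <= 0. */
    assert(logged <= 0);
    printf("writeback error = %d: %s (%s)\n",
           logged, wb_error_name(logged), strerror(-logged));
}
```

Calling wb_error_describe(-2) and wb_error_describe(-28) from a main() prints one line per code, which makes it easy to see that the two reports fail with different errors.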
> 
> Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: 
>  [<ffffffff8028bbb6>] block_read_full_page+0xb5/0x267
> PGD 1f9842067 PUD 1f9843067 PMD 0 
> Oops: 0000 [5] SMP 
> CPU 3 
> Modules linked in:
> Pid: 10900, comm: fsstress Not tainted 2.6.22-rc4-autokern1 #1
> RIP: 0010:[<ffffffff8028bbb6>]  [<ffffffff8028bbb6>] block_read_full_page+0xb5/0x267
> RSP: 0000:ffff8101f984fa48  EFLAGS: 00010213
> RAX: 0000000000000179 RBX: 0000000000000000 RCX: 000000000000000c
> RDX: 000000000000000c RSI: ffffffff802e0f7b RDI: ffff81017ff578c8
> RBP: ffff81017ff578c8 R08: ffff8101f984fbe8 R09: ffff8101f984fbe0
> R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000000e5
> R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff8101803ec5c0(0063) knlGS:00000000f7dec460
> CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 00000001f9841000 CR4: 00000000000006e0
> Process fsstress (pid: 10900, threadinfo ffff8101f984e000, task ffff8101f9824280)
> Stack:  00001000f9ad4080 0000000100000000 0000000000000000 0000000000000179
>  ffff8100de7c4100 ffffffff802e0f7b 000003363e3761bf ffffffff804eac2d
>  ffff8101f984fb48 0000000000000082 ffff81017e9bc550 ffff81017e9bc588
> Call Trace:
>  [<ffffffff802e0f7b>] ext4_get_block+0x0/0x104
>  [<ffffffff804eac2d>] thread_return+0x0/0xd5
>  [<ffffffff8028fcd6>] do_mpage_readpage+0x411/0x430
>  [<ffffffff804eb481>] io_schedule+0x26/0x32
>  [<ffffffff804eb6fb>] __wait_on_bit_lock+0x5f/0x6d
>  [<ffffffff8028fe7e>] mpage_readpage+0x42/0x5b
>  [<ffffffff802e0f7b>] ext4_get_block+0x0/0x104
>  [<ffffffff802395eb>] wake_bit_function+0x0/0x23
>  [<ffffffff8024a9bd>] file_read_actor+0x89/0xf4
>  [<ffffffff8024a21e>] find_get_page+0x1e/0x4d
>  [<ffffffff8024a763>] do_generic_mapping_read+0x20e/0x3df
>  [<ffffffff8024a934>] file_read_actor+0x0/0xf4
>  [<ffffffff8024c2e7>] generic_file_aio_read+0x11d/0x154
>  [<ffffffff8026c7ca>] do_sync_read+0xc8/0x10b
>  [<ffffffff80272c4f>] permission+0xbb/0xbd
>  [<ffffffff802395bd>] autoremove_wake_function+0x0/0x2e
>  [<ffffffff8026be62>] nameidata_to_filp+0x25/0x34
>  [<ffffffff8026be9e>] do_filp_open+0x2d/0x3d
>  [<ffffffff8026f355>] vfs_getattr+0x2b/0x2f
>  [<ffffffff8026f43d>] vfs_fstat+0x33/0x3a
>  [<ffffffff8026c8b8>] vfs_read+0xab/0x12e
>  [<ffffffff8026cbbc>] sys_read+0x45/0x6e
>  [<ffffffff80219f02>] ia32_sysret+0x0/0xa
> 
[skip]
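A note on reading the "Oops: 0000" line in the NULL pointer reports: for a page fault on x86, that number is the hardware page-fault error code. The sketch below decodes the architectural low bits; the names pf_decode() and pf_print() are hypothetical and only illustrate the bit layout. A code of 0000 decodes to a read of a not-present page in kernel mode, which fits a NULL pointer read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* x86 page-fault error code bits (architectural definition). */
#define PF_PROT  (1u << 0) /* set: protection violation; clear: page not present */
#define PF_WRITE (1u << 1) /* set: write access; clear: read access */
#define PF_USER  (1u << 2) /* set: fault in user mode; clear: kernel mode */
#define PF_RSVD  (1u << 3) /* set: reserved bit found in a page-table entry */
#define PF_INSTR (1u << 4) /* set: fault on an instruction fetch */

/* Hypothetical container for the decoded flags. */
struct pf_info {
    bool protection;
    bool write;
    bool user;
    bool reserved;
    bool instr_fetch;
};

/* Split an error code into its individual flags. */
struct pf_info pf_decode(unsigned int code)
{
    struct pf_info info = {
        .protection  = (code & PF_PROT) != 0,
        .write       = (code & PF_WRITE) != 0,
        .user        = (code & PF_USER) != 0,
        .reserved    = (code & PF_RSVD) != 0,
        .instr_fetch = (code & PF_INSTR) != 0,
    };
    return info;
}

/* Print a one-line description of a code taken from an oops line. */
void pf_print(unsigned int code)
{
    /* The oops line shows four hex digits, so the value fits in 16 bits. */
    assert(code <= 0xffffu);
    struct pf_info i = pf_decode(code);
    printf("error code %04x: %s, %s access, %s mode%s%s\n", code,
           i.protection ? "protection violation" : "page not present",
           i.write ? "write" : "read",
           i.user ? "user" : "kernel",
           i.reserved ? ", reserved bit set" : "",
           i.instr_fetch ? ", instruction fetch" : "");
}
```

This applies only to the page-fault oopses; the "invalid opcode: 0000" report is a different trap and its number is not a page-fault error code.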
> ------------[ cut here ]------------
> kernel BUG at fs/ext4/writeback.c:266!
Yep, I've seen this oops too, but IMHO it may be an artifact of the
previous one.
> invalid opcode: 0000 [6] SMP 
> CPU 3 
> Modules linked in:
> Pid: 10851, comm: fsstress Not tainted 2.6.22-rc4-autokern1 #1
> RIP: 0010:[<ffffffff802ed5f6>]  [<ffffffff802ed5f6>] ext4_wb_submit_extent+0x1ef/0x3d9
> RSP: 0000:ffff8101e47cfab8  EFLAGS: 00010246
> RAX: 000000000001182c RBX: ffff8100c6709ca0 RCX: 000000000000000c
> RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8101e8de5000
> RBP: ffff8100c6709a48 R08: ffff8101b1056338 R09: 0000000000000000
> R10: ffff8101b1056338 R11: ffff8100c6709a48 R12: 0000000000000040
> R13: ffff81017eaa5b98 R14: 0000000000000040 R15: 0000000000000001
> FS:  0000000000000000(0000) GS:ffff8101803ec5c0(0063) knlGS:00000000f7dec460
> CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
> CR2: 00000000f7dcb004 CR3: 00000001e47c1000 CR4: 00000000000006e0
> Process fsstress (pid: 10851, threadinfo ffff8101e47ce000, task ffff8101e47a6b30)
> Stack:  ffff8101cf22c9b8 0000000000000000 0000000000000001 0000000c00000001
>  ffff8100c6709a48 000000018028938e ffff8101e47cfb68 0000000000000000
>  ffff8101e47cfd28 ffff8100c6709ca0 ffff8100c6709a48 ffff8100c6709990
> Call Trace:
>  [<ffffffff802edb95>] ext4_wb_handle_extent+0x3b5/0x48c
>  [<ffffffff802ebc24>] ext4_ext_walk_space+0x18a/0x20c
>  [<ffffffff802ed7e0>] ext4_wb_handle_extent+0x0/0x48c
>  [<ffffffff802edcc7>] ext4_wb_flush+0x5b/0x153
>  [<ffffffff802ee1a0>] ext4_wb_writepages+0x34b/0x398
>  [<ffffffff8024f81b>] do_writepages+0x20/0x2d
>  [<ffffffff80286164>] __writeback_single_inode+0x1df/0x3a7
>  [<ffffffff8024a47e>] find_get_pages_tag+0x34/0x89
>  [<ffffffff80250c66>] pagevec_lookup_tag+0x1a/0x24
>  [<ffffffff80249e89>] wait_on_page_writeback_range+0xc7/0x10d
>  [<ffffffff80286702>] sync_sb_inodes+0x1cb/0x2a0
>  [<ffffffff8028687c>] sync_inodes_sb+0xa5/0xb9
>  [<ffffffff803b3e09>] __up_read+0x10/0x8a
>  [<ffffffff802868fa>] __sync_inodes+0x6a/0xb1
>  [<ffffffff80286952>] sync_inodes+0x11/0x29
>  [<ffffffff8028895c>] do_sync+0x2c/0x50
>  [<ffffffff8028898b>] sys_sync+0xb/0xf
>  [<ffffffff80219f02>] ia32_sysret+0x0/0xa
> 
> 
> Code: 0f 0b eb fe f0 41 0f ba 75 00 14 48 8b 4c 24 40 01 51 10 48 
> RIP  [<ffffffff802ed5f6>] ext4_wb_submit_extent+0x1ef/0x3d9
>  RSP <ffff8101e47cfab8>
> 
> I will try the patch below...Alex, any hint about the second oops?
> 
> Mingming
> Alex please 
> On Fri, 2007-06-15 at 09:14 +0400, Alex Tomas wrote:
> > looks like an error in error handling path (notice -28 (ENOSPC) before)
> > 
> > thanks for the report, Alex
> > 
> > Dmitriy Monakhov wrote:
> > > )
> > > 
> > > Simple test failed on ext4 when delayed allocation was used.
> > > #mkfs.ext3 -b4096 /dev/vzvg/test2
> > > #mount -text4dev /dev/vzvg/test2  /mnt/test -odelalloc
> > > #fsstress -d /mnt/test/ -l100  -n100000 -p20  -f dwrite=0
> > > 
> > > <CONSOLE LOG>
> > > EXT4-fs: writeback error = -28
> > > ......
> > > EXT4-fs: writeback error = -28
> > > Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: 
> > >  [<ffffffff802a12d2>] block_read_full_page+0xab/0x25f
> > > PGD 44c1067 PUD 44fd067 PMD 0 
> > > Oops: 0000 [2] SMP 
> > > CPU 0 
> > > Modules linked in: ext4dev jbd2
> > > Pid: 4833, comm: fsstress Not tainted 2.6.22-rc4-mm2 #9
> > > RIP: 0010:[<ffffffff802a12d2>]  [<ffffffff802a12d2>] block_read_full_page+0xab/0x25f
> > > RSP: 0018:ffff810004df9a58  EFLAGS: 00010203
> > > RAX: 0000000000001000 RBX: ffff8100cf4256f8 RCX: 000000000000000c
> > > 
> > > RDX: 0000000000000001 RSI: 000000000000000c RDI: ffff8100cf4256f8
> > > RBP: 0000000000000000 R08: ffff810004df9be8 R09: ffff810004df9c58
> > > R10: 8888888888888888 R11: 8888888888888888 R12: 0000000000000052
> > > R13: 0000000000001000 R14: 0000000000000000 R15: 00000000000000d3
> > > FS:  00002adfe3f7d6f0(0000) GS:ffffffff80730000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > CR2: 0000000000000000 CR3: 0000000004362000 CR4: 00000000000006e0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > Process fsstress (pid: 4833, threadinfo ffff810004df8000, task ffff810004c867a0)
> > > Stack:  ffffffff880180f2 ffff8100054a23f0 0000000000000000 0000000000000000
> > >  ffff810005dcbb80 ffff81000549bf00 0000000000000000 ffff810005def8b0
> > >  ffffffff88029e60 ffffffff8025b2be ffff8100054da540 ffff8100cf496fb0
> > > Call Trace:
> > >  [<ffffffff880180f2>] :ext4dev:ext4_get_block+0x0/0x109
> > >  [<ffffffff8025b2be>] find_get_page+0x21/0x51
> > >  [<ffffffff802a5b45>] do_mpage_readpage+0x45f/0x480
> > >  [<ffffffff880180f2>] :ext4dev:ext4_get_block+0x0/0x109
> > >  [<ffffffff88003d64>] :jbd2:jbd2_journal_dirty_metadata+0x197/0x1be
> > >  [<ffffffff80245f3b>] bit_waitqueue+0x1c/0x99
> > >  [<ffffffff802a5bb4>] mpage_readpage+0x4e/0x67
> > >  [<ffffffff880180f2>] :ext4dev:ext4_get_block+0x0/0x109
> > >  [<ffffffff8028817e>] do_lookup+0x63/0x1ae
> > >  [<ffffffff8025b1ae>] file_read_actor+0x8d/0xf6
> > >  [<ffffffff8025b2be>] find_get_page+0x21/0x51
> > >  [<ffffffff8025b93a>] do_generic_mapping_read+0x23c/0x3da
> > >  [<ffffffff8025b121>] file_read_actor+0x0/0xf6
> > >  [<ffffffff8025d123>] generic_file_aio_read+0x119/0x156
> > >  [<ffffffff80281848>] do_sync_read+0xc9/0x10c
> > > 
> > >  [<ffffffff802845b2>] cp_new_stat+0xe5/0xfd
> > >  [<ffffffff80246007>] autoremove_wake_function+0x0/0x2e
> > >  [<ffffffff80281fba>] vfs_read+0xaa/0x132
> > >  [<ffffffff80282356>] sys_read+0x45/0x6e
> > >  [<ffffffff8020b41e>] system_call+0x7e/0x83
> > > Code: 8b 45 00 a8 01 0f 85 e6 00 00 00 8b 45 00 a8 20 0f 85 c9 00 
> > > <CONSOLE LOG>
> > > 
> > > I've dug into this a little, with the following results:
> > > 
> > > int block_read_full_page(struct page *page, get_block_t *get_block)
> > > {
> > > ...
> > > 1914:	if (!page_has_buffers(page)) <<< page_has_buffers(page) == true
> > > 		create_empty_buffers(page, blocksize, 0);
> > > 	head = page_buffers(page); <<< page_buffers(page) == NULL
> > > <<< I've added debug info here:
> > > <<< page->flags == 100000000000821
> > > <<< PagePrivate(page) == 1, (page)->private == NULL
> > > <<< So the page is marked private but has no buffers attached, which is WRONG.
> > > 
> > > 	iblock = (sector_t)page->index << (PAGE_CACHE_SHIFT - inode->i_blkbits);
> > > 	lblock = (i_size_read(inode)+blocksize-1) >> inode->i_blkbits;
> > > 	bh = head;
> > > 	nr = 0;
> > > 	i = 0;
> > > 
> > > 	do {
> > > 		if (buffer_uptodate(bh)) <<< NULL pointer dereference here results in the oops
> > > .......
> > > }
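The inconsistency described in the excerpt above can be modelled in userspace. This is a minimal sketch under the assumption, suggested by the debug output, that page_has_buffers() tests only the private flag while page_buffers() returns the private pointer. All *_model names are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures carry far more state. */
struct buffer_head_model {
    unsigned long b_state;
};

struct page_model {
    bool private_flag;                 /* models PagePrivate(page) */
    struct buffer_head_model *private; /* models page->private */
};

/* Models page_has_buffers(): looks only at the flag. */
bool model_page_has_buffers(const struct page_model *page)
{
    return page->private_flag;
}

/* Models page_buffers(): returns the pointer without checking it. */
struct buffer_head_model *model_page_buffers(const struct page_model *page)
{
    return page->private;
}

/* Hypothetical invariant check: the flag and the pointer must agree. */
bool model_page_is_consistent(const struct page_model *page)
{
    return page->private_flag == (page->private != NULL);
}

/* Mirrors the first lines of the excerpt: attach buffers only when the
 * flag is clear, then fetch the head. 'fresh' stands in for what
 * create_empty_buffers() would attach. Returns NULL for a page whose
 * flag is set but whose pointer is NULL. */
struct buffer_head_model *model_get_head(struct page_model *page,
                                         struct buffer_head_model *fresh)
{
    assert(page != NULL);
    if (!model_page_has_buffers(page)) {
        page->private = fresh;
        page->private_flag = true;
    }
    return model_page_buffers(page);
}
```

With a page in the reported state (flag set, pointer NULL), model_get_head() skips the allocation and hands back NULL, which is the value the loop in the excerpt then dereferences. A check like model_page_is_consistent() shows the kind of invariant that was violated; where the flag and pointer got out of sync is not determined by this sketch.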
> > > 
> > > -
> > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > > the body of a message to majordomo@...r.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> > 
> 
 
