Message-ID: <20110905125939.GF5466@quack.suse.cz>
Date: Mon, 5 Sep 2011 14:59:40 +0200
From: Jan Kara <jack@...e.cz>
To: Thilo-Alexander Ginkel <thilo@...kel.com>
Cc: linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org
Subject: Re: [bug] ext{3,4}: __find_get_block_slow() failed on 3.0.3
Hi,
On Sat 20-08-11 01:51:49, Thilo-Alexander Ginkel wrote:
> while rsyncing a large amount (> 1TB) of data from an ext3 to an ext4
> on my machine [1], I encountered an issue where rsync and syslog
> eventually started consuming 100% CPU and my syslog was flooded [2]
> with error messages:
>
> -- 8< --
> > kernel: [101543.047293] b_state=0x00000029, b_size=>[10ock01543.04>[101543.047321] __find_get_block_slow() failed. block=328204473, b_blocknr=51867812025
> > kernel: [101543.047330] b_state=0x00000029, b_size=4096
> > kernel: [101543.047>[10ock01543.047348] b_state=0x00000029, b_size=4096
> > kernel: [101543.047353] device blocksize: 4096
> > kernel: [101543.047359] __find_get_block_slow() failed. block=328204473, b01543.0>[10ock01543.047>[1ock01543.047404] b_state=0x00000029, b_size=4096
> > kernel: [101543.047409] device blocksize: 4096
> > kernel: [101543.047414] __find_get_block_slow() failed. block=328204473, b_blocknr=51867812025
> > kernel: [10154ock01543.0>[1ock01543.0492>[1ock01543.0492>[1ock01543.049>[1ock01543.0492>[1ock01543.0>[1ock01543.049>[1ock01543.049>[1ock01543.0492>[10ock01543.0>[1ock=01543.04>[1ock01543.>[1ock01543.0493>[1ock01543.049>[1ock01543.04>[1ock01543.0493>[1ock01543.04941>[1ock01543.0494>[1ock01543.0>[1ock01543.049>[10ock01543.0>[1ock01543.04>[1ock01543.04>[1ock01543.0495>[1ock01543.0495>[1ock01543.0495>[1ock01543.0496>[1ock01543.04>[1ock01543.04>[1ock01543.049>[1ock01543.049>[1ock01543.04>[1ock01543.0497>[1ock01543.0>[1ock01543.0497>[1ock01543.0497>[1ock01543.0498>[1ock01543.0498>[1ock01543.04>[1ock01543.04>[1ock01543.0498>[1ock01543.0498>[1ock01543.0499>[1ock01543.0499>[1ock01543.04>[101543.049967] __find_get_block_slow() failed. block=328204473, b_blocknr=51867812025
> > kernel: [101543.049975] b_state=0x00000029, b_size=4096
> > kernel: [101543.049980] device blocksize: 4096
> > kernel: [101543.049986] __find_get_block_slow() failed. block=328204473, b_blocknr=51867812025
> -- 8< --
>
> These are not preceded by any other error messages (about possible FS
> inconsistencies) as has been the case in the past when bugs related to
> this error message were reported.
>
> Judging by the block size, the possibly corrupt volume is the ext3 one
> (the ext4 volume has a block size of 2048).
>
> A forced fsck.ext{3,4} of the source and target partitions did not
> show any inconsistencies.
>
> Any ideas?
Something has corrupted your buffer head structure in memory (and we then
looped infinitely in __getblk_slow()). bh->b_blocknr was 0xC139000B9 when
it should have been 0x139000B9 (the 5th byte was changed from 0x00 to
0x0C). It might be a hardware fault, a buggy driver, or some other bug; it
is hard to say. You might want to run memtest for some time, or enable some
kernel debug options (DEBUG_PAGEALLOC, DEBUG_SLAB) which might catch the
code causing the corruption (this assumes the problem is at least
occasionally reproducible and you are willing to take the performance
hit)...
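
The arithmetic behind that diagnosis can be checked with a short Python sketch. This is only an illustration, not kernel code: the two numbers are copied from the "block=" and "b_blocknr=" fields in the quoted log, and the helper name differing_bytes is made up for this example.

```python
# Values copied from the quoted log lines.
expected_block = 328204473       # "block=": the block that was looked up
observed_blocknr = 51867812025   # "b_blocknr=": what the buffer head held


def differing_bytes(a, b):
    """Return (byte_index, byte_in_a, byte_in_b) for every byte where a and b
    differ. Byte index 0 is the least significant byte."""
    diff = a ^ b
    result = []
    index = 0
    while diff:
        if diff & 0xFF:
            shift = 8 * index
            result.append((index, (a >> shift) & 0xFF, (b >> shift) & 0xFF))
        diff >>= 8
        index += 1
    return result


print(hex(expected_block))                     # 0x139000b9
print(hex(observed_blocknr))                   # 0xc139000b9
print(hex(expected_block ^ observed_blocknr))  # 0xc00000000
print(differing_bytes(expected_block, observed_blocknr))  # [(4, 0, 12)]
```

Byte index 4 (counting from zero) is the 5th byte, and it holds 0x0C where 0x00 was expected. The lower four bytes are identical, so a single stray byte in the stored block number accounts for the whole difference.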
Honza
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR