Message-ID: <20080520123505.GP15035@mit.edu>
Date: Tue, 20 May 2008 08:35:05 -0400
From: Theodore Tso <tytso@....edu>
To: Bas van Schaik <bas@...es.nl>
Cc: linux-ext4@...r.kernel.org
Subject: Re: Reoccurring ext3 errors: attempt to access beyond end of
device, freeing blocks not in datazone
> After running e2fsck everything is fine again for a week or a little
> longer...
Did e2fsck report any errors? You didn't say; but if this kind of
corruption were really on the hard drive itself, it would cause e2fsck
to scream bloody murder.
If e2fsck didn't report any problems, then the problem is almost
certainly on the read path, and your system is sometimes returning
corrupted data, probably because in some cases when the system asked
for disk block #1234, it somehow got disk block #1257, or something
like that.
In any case, you should have gotten some errors that inodes were
referring to invalid block numbers; that's what the Linux kernel was
complaining about.
> What I would like to know: what are the possible underlying causes for
> the "attempt to access beyond end of device" error? Does anyone see any
> meaning in the block (?) numbers mentioned in my syslog?
The errors you have here indicate a corrupt indirect block. When the
filesystem tried to read from an inode with the corrupted indirect
block, it asked the block I/O layer to read a block far beyond the end
of the block device.
The numbers in the block I/O errors are in units of 512-byte sectors,
so you need to divide them by 8 to get 4k block numbers:
> > May 20 09:13:14 infinity kernel: attempt to access beyond end of device
> > May 20 09:13:14 infinity kernel: loop0: rw=0, want=15629775440, limit=4404019200
So for these, you had
15629775440 / 8 = 0x74736E4A (or in ascii 'Jnst')
13075964688 / 8 = 0x616C6C62 (or in ascii 'blla')
15354014352 / 8 = 0x72657552 (or in ascii 'Ruer')
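If you want to redo that arithmetic yourself, here is a rough sketch in
Python. The helper names are made up for illustration, and it assumes
512-byte sectors, 4k filesystem blocks, and that the block number is
laid out as a 32-bit little-endian value, which is how ext3 stores
block pointers on disk:

```python
import struct

SECTORS_PER_BLOCK = 8  # assumption: 4096-byte blocks / 512-byte sectors


def sector_to_block(sector):
    """Convert a sector number from the block I/O error to a 4k block number."""
    return sector // SECTORS_PER_BLOCK


def block_as_ascii(block):
    """Show the 4 bytes of a 32-bit little-endian block number as text."""
    return struct.pack("<I", block).decode("ascii", errors="replace")


# "want=" values taken from the block I/O errors
for want in (15629775440, 13075964688, 15354014352):
    block = sector_to_block(want)
    print(f"{want} / 8 = {block:#010x} (ascii {block_as_ascii(block)!r})")
```

The byte order matters here: the text reads correctly only when the
least significant byte of the block number is taken first.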
These errors happen when deleting a file with a bogus indirect
block (or garbage in the inode, but that's much less likely and
probably would have triggered additional complaints).
> > May 20 09:15:07 infinity kernel: EXT3-fs error (device loop0): ext3_free_blocks: Freeing blocks not in datazone - block = 1953721929, count = 1
Converting these numbers to hex:
1953721929 = 0x74736E49 (or in ascii 'Inst')
1634495585 = 0x616C6C61 (or in ascii 'alla')
543517044 = 0x20656974 (or in ascii 'tie ')
1919251793 = 0x72657551 (or in ascii 'Quer')
Given that it's all ascii, it looks like the indirect block somehow
was overwritten, or was substituted by text.
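A quick way to test a suspicious block number for this pattern is to
check whether all four of its bytes are printable ASCII. The following
is only a heuristic sketch (the function name is hypothetical, and it
again assumes 32-bit little-endian block pointers); a real block number
can occasionally pass the test by coincidence, but a whole run of them
doing so is a strong hint that text landed in the indirect block:

```python
import struct


def as_text(block):
    """Return the block number's 4 bytes as a string if all are printable
    ASCII, otherwise None."""
    raw = struct.pack("<I", block)  # assumption: 32-bit little-endian pointer
    if all(0x20 <= b <= 0x7E for b in raw):
        return raw.decode("ascii")
    return None


# block numbers from the ext3_free_blocks messages
reported = [1953721929, 1634495585, 543517044, 1919251793]
for block in reported:
    print(f"{block} = {block:#010x} -> {as_text(block)!r}")
```

A small block number such as 1234 contains zero bytes and so returns
None, which is what you would expect from most legitimate pointers on a
filesystem of this size.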
- Ted