Message-Id: <AB177EF8-199A-4A6D-A08F-2A6951F5F100@dilger.ca>
Date: Tue, 19 Nov 2013 09:46:40 -0700
From: Andreas Dilger <adilger@...ger.ca>
To: Stephen Elliott <techweb@...world.com>
Cc: Zheng Liu <gnehzuil.liu@...il.com>,
David Jeffery <djeffery@...hat.com>,
"<linux-ext4@...r.kernel.org>" <linux-ext4@...r.kernel.org>,
Bernd Schubert <bernd.schubert@...m.fraunhofer.de>,
Eric Whitney <enwlinux@...il.com>
Subject: Re: Query FSCK Errors on ext4
As written in earlier replies, the bug is likely in the ext4 code of your appliance, and could possibly be fixed by the patch that was pointed out at that time.
If you ask for help, you actually need to read the replies that are given.
Cheers, Andreas
On 2013-11-19, at 5:44, "Stephen Elliott" <techweb@...world.com> wrote:
> Hi Guys,
>
> Did you have any further feedback on this? It is purely curiosity for me:
>
> I have theorised that the problem comes from the MS Access DB being open
> (over Samba) on client workstations when the server is reloaded.
>
> Since ensuring these are closed prior to reloading, I have not seen further
> FSCK errors on reload. Is there an explanation for this? I can see why this
> may corrupt the DB but not the filesystem.
>
> Many Thanks
> Stephen Elliott
>
> -----Original Message-----
> From: Stephen Elliott [mailto:techweb@...world.com]
> Sent: 28 October 2013 21:18
> To: 'Andreas Dilger'
> Cc: 'Zheng Liu'; 'David Jeffery'; 'linux-ext4@...r.kernel.org List'; 'Bernd
> Schubert'; 'Eric Whitney'
> Subject: RE: Query FSCK Errors on ext4
>
> Ultimately I am not too worried about this problem (now I know the cause)
> but I am intrigued to know what actually caused the issue in the first
> place. As you can see there is some history around the problem.
>
> Also was that defect / bug actually confirmed?
>
> -----Original Message-----
> From: Andreas Dilger [mailto:adilger@...ger.ca]
> Sent: 28 October 2013 20:54
> To: Stephen Elliott
> Cc: Zheng Liu; David Jeffery; linux-ext4@...r.kernel.org List; Bernd
> Schubert; Eric Whitney
> Subject: Re: Query FSCK Errors on ext4
>
> On Oct 28, 2013, at 3:00 AM, Stephen Elliott <techweb@...world.com> wrote:
>> Thanks for the reply guys...
>>
>> The device in question is a ReadyNAS Pro 6, which happens to be running
>> Linux :) I actually saw some issues with e2fsck 1.42.3 earlier this year:
>
> So it looks like your next course of action is to contact ReadyNAS to see if
> they have the patch that Zheng mentioned below in their kernel.
>
> Cheers, Andreas
>
>> ***** File system check forced at Fri Apr 26 20:08:38 WEST 2013 *****
>> fsck 1.41.14 (22-Dec-2010)
>> e2fsck 1.42.3 (14-May-2012)
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 4195619, i_blocks is 3135728, should be 3135904. Fix? yes
>>
>> Running additional passes to resolve blocks claimed by more than one inode...
>> Pass 1B: Rescanning for multiply-claimed blocks
>> Multiply-claimed block(s) in inode 4195619: 167904376 167904377 167904378
>> 167904379 167904380 167904381 167904382 167904383 167904384 167904385
>> 167904386 167949296 167949297 167949298 167949299 167949300 167949301
>> 167949302 167949303 167949304 167949305 167949306
>> Pass 1C: Scanning directories for inodes with multiply-claimed blocks
>> Pass 1D: Reconciling multiply-claimed blocks
>> (There are 1 inodes containing multiply-claimed blocks.)
>>
>> File /PREMIER/Premier Automation Purchase OrdersApp V18.5.mdb (inode
>> #4195619, mod time Fri Apr 26 20:07:42 2013) has 22 multiply-claimed
>> block(s), shared with 0 file(s):
>> Multiply-claimed blocks already reassigned or cloned.
>>
>> Pass 2: Checking directory structure
>> Pass 3: Checking directory connectivity
>> Pass 4: Checking reference counts
>> Pass 5: Checking group summary information
>>
>> /dev/c/c: ***** FILE SYSTEM WAS MODIFIED *****
>> /dev/c/c: 615898/30212096 files (13.6% non-contiguous),
>> 62353456/483393536 blocks
>>
>> After deleting the file (MS Access DB) and re-creating it from backup,
>> the file system got remounted read-only and the following errors were
>> logged:
>>
>> May 8 14:58:15 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904376:freeing already freed block (bit 1144)
>> May 8 14:58:15 despair kernel: Aborting journal on device dm-0-8.
>> May 8 14:58:15 despair kernel: EXT4-fs (dm-0): Remounting filesystem read-only
>> May 8 14:58:15 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904377:freeing already freed block (bit 1145)
>> May 8 14:58:15 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904378:freeing already freed block (bit 1146)
>> May 8 14:58:15 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904379:freeing already freed block (bit 1147)
>> May 8 14:58:15 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904380:freeing already freed block (bit 1148)
>> May 8 14:58:15 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904381:freeing already freed block (bit 1149)
>> May 8 14:58:15 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904382:freeing already freed block (bit 1150)
>> May 8 14:58:16 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904383:freeing already freed block (bit 1151)
>> May 8 14:58:16 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904384:freeing already freed block (bit 1152)
>> May 8 14:58:16 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904385:freeing already freed block (bit 1153)
>> May 8 14:58:16 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5124block 167904386:freeing already freed block (bit 1154)
>> May 8 14:58:16 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949296:freeing already freed block (bit 13296)
>> May 8 14:58:16 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949297:freeing already freed block (bit 13297)
>> May 8 14:58:16 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949298:freeing already freed block (bit 13298)
>> May 8 14:58:16 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949299:freeing already freed block (bit 13299)
>> May 8 14:58:17 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949300:freeing already freed block (bit 13300)
>> May 8 14:58:17 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949301:freeing already freed block (bit 13301)
>> May 8 14:58:17 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949302:freeing already freed block (bit 13302)
>> May 8 14:58:17 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949303:freeing already freed block (bit 13303)
>> May 8 14:58:17 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949304:freeing already freed block (bit 13304)
>> May 8 14:58:17 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949305:freeing already freed block (bit 13305)
>> May 8 14:58:17 despair kernel: EXT4-fs error (device dm-0): mb_free_blocks:1411: group 5125block 167949306:freeing already freed block (bit 13306)
>>
>>
>> These are the same blocks slated as multiply claimed
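[Note: the group and bit values in the kernel messages line up with the block numbers e2fsck listed as multiply-claimed. A minimal sketch of that arithmetic follows, assuming 32768 blocks per group (the usual value for a 4 KiB block size; the thread does not state it) and a first data block of 0. The function name locate is hypothetical.]

```python
BLOCKS_PER_GROUP = 32768  # assumption: default for a 4 KiB block size


def locate(block, blocks_per_group=BLOCKS_PER_GROUP, first_data_block=0):
    """Return (group, bit) for an absolute filesystem block number."""
    return divmod(block - first_data_block, blocks_per_group)


# First and last block of each run reported by e2fsck in April
for block in (167904376, 167904386, 167949296, 167949306):
    group, bit = locate(block)
    print(f"block {block}: group {group}, bit {bit}")
```

Under those assumptions block 167904376 falls in group 5124 at bit 1144 and block 167949296 in group 5125 at bit 13296, the same pairs the kernel logged.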
>>
>> And then running an FSCK, we got the following:
>>
>> ***** File system check forced at Wed May 8 15:16:50 WEST 2013 *****
>> fsck 1.41.14 (22-Dec-2010)
>> e2fsck 1.42.3 (14-May-2012)
>> /dev/c/c: recovering journal
>> Pass 1: Checking inodes, blocks, and sizes
>> Pass 2: Checking directory structure
>> Pass 3: Checking directory connectivity
>> Pass 4: Checking reference counts
>> Pass 5: Checking group summary information
>> Free blocks count wrong for group #5124 (28170, counted=28159).
>> Fix? yes
>>
>> Free blocks count wrong for group #5125 (25861, counted=25850).
>> Fix? yes
>>
>> Free blocks count wrong (420683133, counted=420644972).
>> Fix? yes
>>
>> Free inodes count wrong (29595347, counted=29595271).
>> Fix? yes
>>
>>
>> /dev/c/c: ***** FILE SYSTEM WAS MODIFIED *****
>> /dev/c/c: 616825/30212096 files (13.6% non-contiguous),
>> 62748564/483393536 blocks
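[Note: the per-group corrections in this e2fsck run can be tallied mechanically. A small sketch, assuming the message wording quoted in this thread; the helper name group_deltas is made up for illustration.]

```python
import re

# Matches the per-group message only; the filesystem-wide total has no "for group #N".
PATTERN = re.compile(
    r"Free blocks count wrong for group #(\d+) \((\d+), counted=(\d+)"
)


def group_deltas(text):
    """Map group number -> (counted - recorded) free blocks."""
    return {
        int(group): int(counted) - int(recorded)
        for group, recorded, counted in PATTERN.findall(text)
    }


sample = """\
Free blocks count wrong for group #5124 (28170, counted=28159).
Free blocks count wrong for group #5125 (25861, counted=25850).
Free blocks count wrong (420683133, counted=420644972).
"""
print(group_deltas(sample))
```

Both groups come out 11 free blocks lower than recorded, which matches the 11 blocks per group in the multiply-claimed list from April.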
>>
>> Then later in the year I reloaded the server with the database open
>> from several client machines
>>
>> ***** File system check forced at Tue Jul 23 21:02:13 WEST 2013 *****
>> fsck 1.42.8 (20-Jun-2013)
>> e2fsck 1.42.8 (20-Jun-2013)
>> Pass 1: Checking inodes, blocks, and sizes
>> Inode 4195619, end of extent exceeds allowed value
>> (logical block 64907, physical block 11435403, len 16)
>> Clear? yes
>>
>> Inode 4195619, i_blocks is 1337216, should be 1337176. Fix? yes
>>
>> Pass 2: Checking directory structure
>> Pass 3: Checking directory connectivity
>> Pass 4: Checking reference counts
>> Pass 5: Checking group summary information
>> Block bitmap differences: -(11435403--11435407)
>> Fix? yes
>>
>> Free blocks count wrong for group #348 (2130, counted=2135).
>> Fix? yes
>>
>> Free blocks count wrong (417470107, counted=417470112).
>> Fix? yes
>>
>>
>> /dev/c/c: ***** FILE SYSTEM WAS MODIFIED *****
>> /dev/c/c: 625785/30212096 files (13.6% non-contiguous),
>> 65923424/483393536 blocks
>>
>> Again related to the same file, which is only an MS Access DB open from
>> several client machines over SMB when the server is rebooted. Moving forward
>> I ensure all instances are closed when reloading but even so I am surprised
>> that a clean reload causes corruption at the filesystem level.
>>
>> Since ensuring the DB is closed before reload, I have seen no further
>> issues like this.
>>
>> Many Thanks
>> Stephen Elliott
>>
>> -----Original Message-----
>> From: Zheng Liu [mailto:gnehzuil.liu@...il.com]
>> Sent: 28 October 2013 06:39
>> To: Andreas Dilger
>> Cc: Stephen Elliott; David Jeffery; linux-ext4@...r.kernel.org List;
>> Bernd Schubert; Eric Whitney
>> Subject: Re: Query FSCK Errors on ext4
>>
>> [Cc Eric Whitney to confirm this problem]
>>
>> Hi Andreas,
>>
>> If I remember correctly, this patch might fix this problem [1].
>>
>> 1. http://www.spinics.net/lists/linux-ext4/msg39485.html
>>
>> Regards,
>> - Zheng
>>
>> On Mon, Oct 28, 2013 at 12:13:26AM -0600, Andreas Dilger wrote:
>>> The error reported here is a relatively new one. It only appeared in
>>> e2fsck 1.42.8, and wasn't in the code that I'm using locally (1.42.7)
>>> so I wasn't sure what it actually meant without looking at it.
>>>
>>> It looks like some kind of overflow of the extent tree, which causes
>>> e2fsck to chop off the last 5 disk blocks (40 sectors), though I'm
>>> not sure exactly why. From your comments, this can be reproduced
>>> with your database usage? Does it use fallocate() or any other
>>> strange IO operations that might be causing this?
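[Note: the figure of 5 disk blocks (40 sectors) can be checked against the July e2fsck output in this thread. A sketch, assuming a 4096-byte block size and that i_blocks counts 512-byte sectors (neither is stated in the thread); the function name is hypothetical.]

```python
SECTOR_SIZE = 512   # assumption: i_blocks is counted in 512-byte units
BLOCK_SIZE = 4096   # assumption: 4 KiB filesystem blocks


def sectors_to_blocks(sectors):
    """Convert a sector count to whole filesystem blocks."""
    blocks, remainder = divmod(sectors * SECTOR_SIZE, BLOCK_SIZE)
    if remainder:
        raise ValueError("sector count is not a whole number of blocks")
    return blocks


# July: "i_blocks is 1337216, should be 1337176"
trimmed = sectors_to_blocks(1337216 - 1337176)
# July: "Block bitmap differences: -(11435403--11435407)"
bitmap_blocks = 11435407 - 11435403 + 1
print(trimmed, bitmap_blocks)
```

Under those assumptions the 40-sector i_blocks correction and the five-block bitmap range describe the same five blocks.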
>>>
>>> Have you tried updating your kernel? If there is repeated corruption
>>> appearing in the filesystem, then it is either a bug in the kernel or
>>> in e2fsck. Not really sure which one to blame at this point.
>>>
>>> Cheers, Andreas
>>>
>>> On Oct 18, 2013, at 9:45 AM, Stephen Elliott <techweb@...world.com> wrote:
>>>
>>>> Any feedback on this guys??? Would really appreciate somebody taking a
>>>> look over this.
>>>>
>>>> From: Stephen Elliott [mailto:techweb@...world.com]
>>>> Sent: 22 September 2013 20:13
>>>> To: linux-ext4@...r.kernel.org; linux-fsdevel@...r.kernel.org; Andreas
>>>> Dilger (adilger@...ger.ca); 'Bernd Schubert'
>>>> Subject: Query FSCK Errors on ext4
>>>>
>>>> Hi all,
>>>>
>>>> I have theorised that the problem comes from the MS Access DB being open
>>>> (over Samba) on client workstations when the server is reloaded.
>>>>
>>>> Since ensuring these are closed prior to reloading, I have not seen
>>>> further FSCK errors on reload. Is there an explanation for this? I can see
>>>> why this may corrupt the DB but not the filesystem.
>>>>
>>>> Just as a primer, I used a ReadyNAS NV+ for many years which was running
>>>> ext3 and never had this issue. However, since using ext4 on a ReadyNAS Pro,
>>>> I now see this issue.
>>>>
>>>> Many Thanks
>>>> Stephen Elliott
>>>>
>>>> From: Stephen Elliott [mailto:techweb@...world.com]
>>>> Sent: 23 July 2013 22:02
>>>> To: linux-ext4@...r.kernel.org; linux-fsdevel@...r.kernel.org; Andreas
>>>> Dilger (adilger@...ger.ca); 'Bernd Schubert'
>>>> Subject: RE: FSCK Errors on ext4
>>>>
>>>> If it helps guys, the same file as before is causing the issue with
>>>> inode 4195619, a very large MS Access DB.
>>>>
>>>> From: Stephen Elliott [mailto:techweb@...world.com]
>>>> Sent: 23 July 2013 21:52
>>>> To: linux-ext4@...r.kernel.org; linux-fsdevel@...r.kernel.org; Andreas
>>>> Dilger (adilger@...ger.ca); 'Bernd Schubert'
>>>> Subject: FSCK Errors on ext4
>>>>
>>>> Hi Andreas / Bernd / all,
>>>>
>>>> You may recall advising me on another batch of FSCK errors a few months
>>>> back.
>>>>
>>>> The same device on an ext4 file system has produced the following errors
>>>> after a clean reload. It seems to be fine now but wanted your input on this.
>>>> No bad blocks are reported on the devices etc.
>>>>
>>>> ***** File system check forced at Tue Jul 23 21:02:13 WEST 2013 *****
>>>> fsck 1.42.8 (20-Jun-2013)
>>>> e2fsck 1.42.8 (20-Jun-2013)
>>>> Pass 1: Checking inodes, blocks, and sizes
>>>> Inode 4195619, end of extent exceeds allowed value
>>>> (logical block 64907, physical block 11435403, len 16)
>>>> Clear? yes
>>>>
>>>> Inode 4195619, i_blocks is 1337216, should be 1337176. Fix? yes
>>>>
>>>> Pass 2: Checking directory structure
>>>> Pass 3: Checking directory connectivity
>>>> Pass 4: Checking reference counts
>>>> Pass 5: Checking group summary information
>>>> Block bitmap differences: -(11435403--11435407)
>>>> Fix? yes
>>>>
>>>> Free blocks count wrong for group #348 (2130, counted=2135).
>>>> Fix? yes
>>>>
>>>> Free blocks count wrong (417470107, counted=417470112).
>>>> Fix? yes
>>>>
>>>>
>>>> /dev/c/c: ***** FILE SYSTEM WAS MODIFIED *****
>>>> /dev/c/c: 625785/30212096 files (13.6% non-contiguous),
>>>> 65923424/483393536 blocks
>>>>
>>>> Many Thanks
>>>> Stephen Elliott
>>>
>>>
>>> Cheers, Andreas
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4"
>>> in the body of a message to majordomo@...r.kernel.org More majordomo
>>> info at http://vger.kernel.org/majordomo-info.html
>
>
> Cheers, Andreas