lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 25 Sep 2006 14:27:24 +0200
From:	Helge Hafting <helge.hafting@...el.hist.no>
To:	Molle Bestefich <molle.bestefich@...il.com>
CC:	linux-kernel@...r.kernel.org
Subject: Re: ext3 corruption

Molle Bestefich wrote:
> I wrote:
>> I have a ~1TB filesystem that failed to mount today, the message is:
>>
>> EXT3-fs error (device loop0): ext3_check_descriptors: Block bitmap for
>> group 2338 not in group (block 1607003381)!
>> EXT3-fs: group descriptors corrupted !
>>
>> Yesterday it worked flawlessly.
>
> Helge Hafting wrote:
>> > And voila, that difficult task of assessing in which order to do
>> > things is out of the hands of distros like Red Hat, and into the
>> > hands of those people who actually make the binaries.
>>
>> Not so easy.  You do not want to shut down md devices because
>> samba is using them. Someone else may run samba on a single
>> harddisk and also have some md-devices that they take down
>> and bring up a lot.  So having samba generally depend on md doesn't
>> work.  Your setup need it, others may have different needs.
>
> I've looked hard at things and just found that maybe it's not the init
> order that's to blame..
>
> It seems that unmounting the filesystem fails with a "device busy" error.
> I'm not sure why there's still open files on the device, but perhaps a
> remote user is copying a file or some such (likely).
That is solvable by shutting down remote operations first.
So stop samba (or nfs or whatever) before attempting to umount.
> Anyway, the system is shutting down, so it should just forcefully
> unmount the device, but it doesn't.
> The halt script tries "umount" three times, which all fail with:
> "device is busy".
> It then actually tries "umount -f" three times, which all fail with
> "Device or resource busy"
> At which point the halt script turns off the machine and the
> filesystem is ruined.
>
> How to fix forceful unmount so it works?
I don't know, other than researching what filesystems support
forced umount and use one of those. Complain to the vendor or maintainer
of your particular filesystem.

However, you can usually find out why some file is open. Try
umount yourself, when it doesn't work, use "lsof" to see
what file is open. Then figure out who or what is keeping it open.
To debug a shutdown problem, consider putting "lsof >> logfile"
in your shutdown script.

Not a solution but a workaround: Run "sync" before shutdown.
(Stick it in some script.)
Now, all data in filesystem caches will be written to disk before power 
is lost.
This isn't perfect, but filesystem damage is greatly minimized and
often avoided completely.  Useful while waiting for a better solution.

The real solution is to set things up so unforced umount works.
This is normally possible to do.


Helge Hafting
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ