linux-ext4 - Re: maintain badblocks list on the fly


Message-ID: <52D7AB49.2090907@rempel-privat.de>
Date:	Thu, 16 Jan 2014 10:50:01 +0100
From:	Oleksij Rempel <linux@...pel-privat.de>
To:	Theodore Ts'o <tytso@....edu>
CC:	linux-ext4@...r.kernel.org
Subject: Re: maintain badblocks list on the fly

On 06.01.2014 02:27, Theodore Ts'o wrote:
> On Sun, Jan 05, 2014 at 11:02:49AM +0100, Oleksij Rempel wrote:
>>
>> After some googling I didn't find an answer to my question, so I'll ask
>> it directly here: does it make sense, and is it possible, to maintain the
>> bad block list of ext4 on the fly? I mean, if ext4 gets an error from,
>> for example, the ata subsystem, could it mark the block as bad, or maybe
>> better as "probably bad"?
> 
> Figuring out what to do in case of an error is tricky.  Sometimes
> errors are transient.  For example, losing a connection (perhaps
> briefly) to a disk connected via Fibre Channel.
> 
> Also, with most hard drives, if you rewrite a block which has reported
> a read error, the hard drive will usually remap the block to one of
> the blocks in the spare pool.  So one strategy when you get a read
> error is not to avoid using the block forever, but to simply write all
> zeros to the block, and then see if the block is now valid.  But now
> combine this with the "some errors are transient" problem --- if you
> do a forced rewrite, you might lose data that you could get back if you
> try rereading the block later.  So it's a rare file system author that
> is willing to do an automated forced rewrite when getting a read
> error.
> 
> For a write error, it's safer to try rewriting the block, but most of
> the time the hard drive will have tried rewriting the block already,
> unless it's due to a connection problem between the file system and
> the storage device.  For example, suppose the file system is accessing
> an iSCSI block device where the transport layer between the computer
> and the storage device is a TCP connection...
> 
> So the problem with automated error recovery is that it's highly
> dependent on the storage device (is it a RAID; a hard drive; an iSCSI
> device, etc.) and the application / what are you storing.
> 
> For example, if the file system is on a directly connected HDD as the
> back end for a cluster file system such as hadoopfs or the Google File
> System, where the cluster file system is storing every chunk of its
> file replicated on multiple file servers, and/or using some kind of
> Reed-Solomon encoding, when you detect a read error on a data block, the
> best thing to do might be to delete the file (relying on the fact that
> the next time you write to the bad block, the HDD will remap the block
> to one of the blocks in the spare pool), and then inform the cluster
> file system that it should do a Reed-Solomon reconstruction or
> otherwise reshard that portion of the file.
> 
> At one point I toyed with trying to get something upstream where the
> bad block notification would get sent via a netlink channel.  That way
> userspace can do something appropriate, instead of trying to encode
> what can potentially be extremely complicated policy decisions into the
> kernel.  I never had the time to get the design and interface clean
> enough for upstream, though.
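
To make the policy argument above concrete, here is a minimal sketch of what a userspace handler for such notifications might decide. Everything in it is an assumption for illustration: no such netlink interface exists upstream, and the names BlockErrorEvent, Action and choose_action are hypothetical. It only encodes the cases discussed in the quoted text.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RETRY_LATER = "retry later"
    REWRITE_BLOCK = "rewrite block"
    DROP_AND_RECONSTRUCT = "delete file, ask cluster layer to reconstruct"
    REPORT_ONLY = "report to administrator"


@dataclass
class BlockErrorEvent:
    # Hypothetical event layout; not a real kernel interface.
    op: str           # "read" or "write"
    transport: str    # "direct", "iscsi" or "fc"
    replicated: bool  # True if a higher layer keeps other copies


def choose_action(ev: BlockErrorEvent) -> Action:
    # Networked transports can fail transiently, so avoid anything
    # destructive and just try again later.
    if ev.transport in ("iscsi", "fc"):
        return Action.RETRY_LATER
    # Rewriting after a write error loses nothing that was readable.
    if ev.op == "write":
        return Action.REWRITE_BLOCK
    # Read error, but the data exists elsewhere: drop and rebuild.
    if ev.replicated:
        return Action.DROP_AND_RECONSTRUCT
    # Read error on the only copy: never force a rewrite automatically.
    return Action.REPORT_ONLY
```

Even this toy version needs knowledge of the transport and of the layers above the file system, which is the information the kernel does not have.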

Good point.
Back to this drive: it appears to be one of those drives that report
errors but do not record them in SMART. One day of testing caused SMART
to report 2 pending corrupt blocks. After a reboot, it showed 0 pending
and 0 reallocated blocks. That means there is no way to detect hardware
degradation on this drive with SMART.
The drive reports no errors on write, only sometimes on read, and there
is no guarantee that the data read back is not corrupt.
If I see it correctly, the devices mostly affected are consumer machines
like laptops and PCs. They have no RAID, mostly no backups, and even if
they had some primitive backup option, it would not really help to find
and restore corrupt files.
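
For the "find corrupt files" part, a minimal userspace sketch, assuming a checksum manifest is recorded while the files are still known to be good. The helper names (file_digest, build_manifest, find_suspect_files) are made up for this example; this is not an ext4 or e2fsprogs feature.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Record (mtime_ns, size, digest) for every regular file under root."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            manifest[path] = (st.st_mtime_ns, st.st_size, file_digest(path))
    return manifest


def find_suspect_files(manifest):
    """List files whose content changed although mtime and size did not."""
    suspects = []
    for path, (mtime_ns, size, digest) in manifest.items():
        try:
            st = os.stat(path)
            if (st.st_mtime_ns, st.st_size) != (mtime_ns, size):
                continue  # looks like a legitimate modification
            if file_digest(path) != digest:
                suspects.append(path)
        except FileNotFoundError:
            continue  # deleted since the manifest was built
        except OSError:
            suspects.append(path)  # e.g. an I/O error while reading
    return suspects
```

A file listed by find_suspect_files() is the one to restore from backup. Note that a re-read may be served from the page cache rather than from the disk, so such a check is only meaningful once the cached data has been dropped, for example after a reboot.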

-- 
Regards,
Oleksij

