Date:	Sat, 19 Sep 2009 13:31:23 +0200
From:	Francesco Pretto <ceztkoml@...il.com>
To:	Andreas Dilger <adilger@....com>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: Does ext4 perform online update of the bad blocks inode?

2009/9/18 Andreas Dilger <adilger@....com>:
>
> This isn't even safe on an UNMOUNTED filesystem, since "badblocks"
> by default does destructive testing of the block device.

By destructive testing, I think you mean a read/write test here, since
a read-only test isn't supposed to be destructive (usually). According
to badblocks(8), option -n, "By default only a non-destructive
read-only test is done". Moreover, according to fsck.ext4(8), option
-c, "This option causes e2fsck to use badblocks(8) program to do a
read-only scan of the device in order to find any bad blocks", and
later: "If this option is specified twice, then the bad block scan
will be done using a non-destructive read-write test". So I think the
*potentially* unsafe command you meant was "fsck.ext4 -n -c -c device".
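
Spelled out, and with only example device names, the invocations I'm
talking about would look like this (this is just my reading of the man
pages):

  # badblocks alone: read-only by default, -n for the non-destructive
  # read-write mode, -w for the destructive write test
  badblocks /dev/sdb
  badblocks -n /dev/sdb
  badblocks -w /dev/sdb

  # through e2fsck: -c is the read-only scan, -c -c the non-destructive
  # read-write one; -n so that nothing but the bad blocks inode gets
  # updated
  fsck.ext4 -n -c /dev/sdb1
  fsck.ext4 -n -c -c /dev/sdb1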

Assuming that the manual is correct, and "fsck.ext4 -n -c device"
really does perform a read-only test, opening the fs just to update
the bad blocks inode, my question still stands: is it safe to run it
weekly on a mounted filesystem? The wording of the manual seems to say
"yes, it's supposed to be safe, but don't do it because of
<unexplained reason>" :-)
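
In practice, what I have in mind is nothing more than a weekly cron
job; the script name and device below are purely hypothetical, and of
course this assumes the read-only scan really is safe on a mounted
filesystem:

  #!/bin/sh
  # hypothetical /etc/cron.weekly/ext4-badblocks-scan:
  # read-only bad block scan of the (mounted) root partition
  fsck.ext4 -n -c /dev/sda1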

> Since most
> disks will internally relocate bad blocks on writes, it is very
> unlikely that "badblocks" will ever find a problem on a new disk.
>

I'd like to believe you, but please read the "smartctl --all" output
(attached) for a Toshiba 120GB notebook drive I recently replaced, or
just look at this excerpt:

  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       2
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       2
....
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       00%       6366         57398211
# 2  Extended offline    Completed: read failure       00%       6350         57398211
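
For the record, output like the above comes from nothing more exotic
than the following (the device name is again just an example):

  smartctl --all /dev/sda       # attribute table plus the self-test log
  smartctl -t long /dev/sda     # start an extended offline self-test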

So, just 2 sectors reallocated, yet read failures are still visible at
the Linux block device layer. I can guarantee this: I repeated read
tests on the disk extensively, and there was no way I could force the
drive to relocate more failing sectors through its own SMART
mechanism. So what I mean is that the hardware bad block relocation
feature may not work as expected even on modern drives. Because of a
buggy implementation? I don't know.

You didn't answer my main question: does ext4 do something when a
read/write failure is detected at the block device layer? Exotic
filesystems like NTFS (when running Windows, sure) seem to update
their bad blocks list online, so it doesn't seem a bad thing for
notebook/desktop users.

The same problem is open for DM users: since EVMS is deprecated, there
is no BBR target any more. So, for example, what if your buggy hard
drive doesn't intercept the first and only failing sector? The error
reaches the block device layer and the failing drive is
deactivated/removed from the RAID volume. It's not good for me to
throw away a disk for just one failing sector; the best I can come up
with is the hand-made dm table sketched below. This is a matter for
another mailing list, though, so please ignore it.
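
Purely as an illustration, and not something I have actually tested:
the closest thing to BBR I can think of with plain device-mapper is a
hand-written table of "linear" targets that redirects the single bad
sector to a spare one stolen from the end of the disk. All names and
numbers below are only examples (BAD would come from the SMART log,
TOTAL from "blockdev --getsz"):

  BAD=57398211        # first failing LBA from the self-test log
  TOTAL=234441648     # example size of the disk in 512-byte sectors
  SPARE=$((TOTAL - 1))
  printf '%s\n' \
    "0 $BAD linear /dev/sdb 0" \
    "$BAD 1 linear /dev/sdb $SPARE" \
    "$((BAD + 1)) $((TOTAL - BAD - 2)) linear /dev/sdb $((BAD + 1))" |
    dmsetup create skipbad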

Regards,
Francesco

Download attachment "smartctl-all" of type "application/octet-stream" (11056 bytes)
