Date:	Thu, 7 Aug 2008 14:53:18 -0400
From:	"John Stoffel" <john@...ffel.org>
To:	"Martin K. Petersen" <martin.petersen@...cle.com>
Cc:	linasvepstas@...il.com, "Alan Cox" <alan@...rguk.ukuu.org.uk>,
	"John Stoffel" <john@...ffel.org>,
	"Alistair John Strachan" <alistair@...zero.co.uk>,
	linux-kernel@...r.kernel.org
Subject: Re: amd64 sata_nv (massive) memory corruption

>>>>> "Martin" == Martin K Petersen <martin.petersen@...cle.com> writes:

>>>>> "Linas" == Linas Vepstas <linasvepstas@...il.com> writes:
Linas> My problem is that the corruption I see is "silent": so
Linas> redundancy is useless, as I cannot distinguish good blocks from
Linas> bad.  I'm running RAID, one of the two disks returns bad data.
Linas> Without checksums, I can't tell which version of a block is the
Linas> good one.

Martin> But btrfs can.

Maybe.  I'd not trust btrfs even now because the on-disk format is
going to change yet again from the currently released version.  I'm
personally interested in it, but not quite enough to use it.  :]

Linas> There is also an interesting possibility that offers a middle
Linas> ground between raw performance and safety: instead of verifying
Linas> checksums on *every* read access, it could be enough to verify
Linas> only every so often -- say, only one out of every 10 reads, or
Linas> maybe triggered by a cron job in the middle of the night: turn
Linas> on verification, touch a bunch of files for an hour or two,
Linas> turn off verification before 6AM.

If you're reading the file off disk anyway, verifying it at that point
costs next to nothing, especially if the checksum is stored either in
the metadata or next to the blocks themselves.
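
To make the "verify it while you're reading it" point concrete, here is
a minimal sketch in Python. It is not how btrfs or any real filesystem
lays out its data: the block layout (payload followed by a 4-byte
CRC32) and the names pack_block and read_block are assumptions made up
for illustration. The point is only that once the block is in memory,
the check is a single pass over data you already have.

```python
import zlib

CRC_LEN = 4  # assumed layout: a 4-byte CRC32 is stored right after the payload


def pack_block(payload: bytes) -> bytes:
    """Return the payload with its CRC32 appended (hypothetical on-disk form)."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(CRC_LEN, "big")


def read_block(raw: bytes) -> bytes:
    """Verify the trailing CRC32 and return the payload, or raise on mismatch."""
    if len(raw) < CRC_LEN:
        raise ValueError("block too short to contain a checksum")
    payload = raw[:-CRC_LEN]
    stored = int.from_bytes(raw[-CRC_LEN:], "big")
    if zlib.crc32(payload) != stored:
        # The data was already read; detecting this cost one pass over memory.
        raise IOError("checksum mismatch: block is corrupt")
    return payload
```

A block that fails the check is reported instead of being handed back
silently, which is exactly what is missing in the "silent" corruption
case.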

It's corruption in files which are never read that turns into a
problem, since nothing ever gets a chance to notice it.

Martin> All evidence suggests that scrubbing is a good way to keep
Martin> your data healthy.

Yup.  And mirroring anything you think is important.  Disk is cheap,
mirroring is good.
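
Scrubbing plus mirroring can be sketched roughly as follows. This is an
illustration under stated assumptions, not the md or btrfs scrub code:
the two mirrors are modelled as Python lists of blocks, the known-good
CRC32 values are assumed to be stored somewhere trustworthy, and the
function name scrub is made up. It walks every block, read or not, and
uses the checksum to decide which side of the mirror is the good one.

```python
import zlib


def scrub(mirror_a, mirror_b, checksums):
    """Check every block on both mirrors against its stored CRC32.

    mirror_a, mirror_b: lists of bytes, one entry per block (modified in place).
    checksums: list of expected CRC32 values, one per block.
    Returns (repaired, unrecoverable) as two lists of block indices.
    """
    repaired, unrecoverable = [], []
    for i, expected in enumerate(checksums):
        ok_a = zlib.crc32(mirror_a[i]) == expected
        ok_b = zlib.crc32(mirror_b[i]) == expected
        if ok_a and ok_b:
            continue  # both copies are fine
        if ok_a:
            mirror_b[i] = mirror_a[i]  # rewrite the bad copy from the good one
            repaired.append(i)
        elif ok_b:
            mirror_a[i] = mirror_b[i]
            repaired.append(i)
        else:
            unrecoverable.append(i)  # both copies bad: report, don't guess
    return repaired, unrecoverable
```

Without the checksum list, the two "not ok" branches collapse into "the
copies differ" and there is no way to pick a winner, which is the
situation Linas describes.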

Heck, I'd pay good money for a SATA disk which mirrored inside itself
or which joined two separate spindle/head assemblies into one and did
all the error correction at a low level.

