Open Source and information security mailing list archives
 
Message-ID: <alpine.LFD.2.00.0903301038100.3948@localhost.localdomain>
Date:	Mon, 30 Mar 2009 10:51:24 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ric Wheeler <rwheeler@...hat.com>
cc:	"Andreas T.Auer" <andreas.t.auer_lkml_73537@...us.ath.cx>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Theodore Tso <tytso@....edu>, Mark Lord <lkml@....ca>,
	Stefan Richter <stefanr@...6.in-berlin.de>,
	Jeff Garzik <jeff@...zik.org>,
	Matthew Garrett <mjg59@...f.ucam.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29



On Mon, 30 Mar 2009, Ric Wheeler wrote:
> >
> > But turn that around, and say: if you don't have redundant disks, then
> > pretty much by definition those drive flushes won't be guaranteeing your
> > data _anyway_, so why pay the price?
> 
> They do in fact provide that promise for the extremely common case of power
> outage and as such, can be used to build reliable storage if you need to.

No they really effectively don't. Not if the end result is "oops, the 
whole track is now unreadable" (regardless of whether it happened due to a 
write during power-out or during some entirely unrelated disk error). Your 
"flush" didn't result in a stable filesystem at all, it just resulted in a 
dead one.

That's my point. Disks simply aren't that reliable. Anything you do with 
flushing and ordering won't make them magically not have errors any more.

> Heat is a major killer of spinning drives (as is severe cold). A lot of times,
> drives that have read errors only (not failed writes) might be fully
> recoverable if you can re-write that injured sector.

It's not worked for me, and yes, I've tried. Maybe I've been unlucky, but 
every single case I can remember of having read failures, that drive has 
been dead. Trying to re-write just the sectors with the error (and around 
it) didn't do squat, and rewriting the whole disk didn't work either.

I'm sure it works for some "ok, the write just failed to take, and the CRC 
was bad" case, but that's apparently not what I've had. I suspect either 
the track markers got overwritten (and maybe a disk-specific low-level 
reformat would have helped, but at that point I was not going to trust the 
drive anyway, so I didn't care), or there was actual major physical damage 
due to heat and/or head crash and remapping was just not able to cope.

> > Sure. And those "write flushes" really only cover a rather small percentage.
> > For many setups, the other corruption issues (drive failure) are not just
> > more common, but generally more disastrous anyway. So why would a person
> > like that worry about the (rare) power failure?
> 
> This is simply not a true statement from what I have seen personally.

You yourself said that software errors were your biggest issue. The write 
flush wouldn't matter for those (but the elevator barrier would).

> The elevator does not issue write barriers on its own - those write barriers
> are sent down by the file systems for transaction commits.

Right. But "elevator write barrier" vs "sending a drive flush command" are 
two totally independent issues. You can do one without the other (although 
doing a drive flush command without the write barrier is admittedly kind 
of pointless ;^)

And my point is, IT MAKES SENSE to just do the elevator barrier, _without_ 
the drive command. If you worry much more about software (or non-disk 
component) failure than about power failures, you're better off just doing 
the software-level synchronization, and leaving the hardware alone.
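
A toy sketch of that distinction, for illustration only. This is not kernel 
code: the function name, the "BARRIER" and "FLUSH" markers, and the 
flush_on_barrier switch are all hypothetical. The assumption is an elevator 
that sorts writes by sector but never moves a write across a barrier, with 
the drive cache flush as a separate, optional step.

```python
BARRIER = "BARRIER"  # hypothetical marker a filesystem queues at a commit
FLUSH = "FLUSH"      # hypothetical stand-in for a drive cache flush command


def elevator_schedule(requests, flush_on_barrier=False):
    """Return the order in which requests would be sent to the drive.

    Writes are plain sector numbers. Within a barrier-delimited batch they
    are sorted by sector (the "elevator" part). No write ever crosses a
    barrier. A drive flush is emitted only when flush_on_barrier is set.
    """
    dispatched = []
    batch = []
    for req in requests:
        if req == BARRIER:
            # Everything queued before the barrier goes out first.
            dispatched.extend(sorted(batch))
            batch = []
            if flush_on_barrier:
                # Separate decision: also ask the drive to empty its cache.
                dispatched.append(FLUSH)
        else:
            batch.append(req)
    dispatched.extend(sorted(batch))
    return dispatched


queue = [9, 2, BARRIER, 5, 1]
ordering_only = elevator_schedule(queue)
ordering_and_flush = elevator_schedule(queue, flush_on_barrier=True)
```

In this model the ordering-only schedule still keeps the pre-barrier writes 
ahead of the post-barrier ones. The flush variant adds one extra command and 
changes nothing about the software-level ordering.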

			Linus