linux-kernel - Re: Linux 2.6.29

Message-ID: <alpine.LFD.2.00.0903280916230.3994@localhost.localdomain>
Date:	Sat, 28 Mar 2009 09:32:36 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Stefan Richter <stefanr@...6.in-berlin.de>
cc:	Mark Lord <lkml@....ca>, Jeff Garzik <jeff@...zik.org>,
	Matthew Garrett <mjg59@...f.ucam.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Theodore Tso <tytso@....edu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 2.6.29

On Sat, 28 Mar 2009, Stefan Richter wrote:
> 
> Sure.  I forgot:  Not only the frequency of I/O disruption (e.g. due to
> kernel crash) factors into system reliability; the particular impact of
> such disruption is a factor too.  (How hard is recovery?  Will at least
> old data remain available? ...)

I suspect (at least from my own anecdotal evidence) that a lot of system 
crashes are basically X hanging. If you use the system as a desktop, at 
that point it's basically dead - and the difference between an X hang and 
a kernel crash is almost totally invisible to users.

Us kernel people may walk over to another machine and ping or ssh in to 
see, but ask yourself how many normal users would do that - especially 
since DOS and Windows have taught people that they need to power-cycle 
(and, in all honesty, especially since there usually is very little else 
you can do even under Linux if X gets confused).

And then part of the problem ends up being that while in theory the kernel 
can continue to write out dirty stuff, in practice people press the power 
button long before it can do so. The 30 second thing is really too long.
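
For readers who want to see the window on their own machine, here is a small sketch. It assumes that "the 30 second thing" corresponds to the Linux VM writeback tunables under /proc/sys/vm, in particular `dirty_expire_centisecs`, whose usual default of 3000 is 30 seconds. That mapping is an assumption, not something stated in this message, and the helper names are made up for illustration.

```python
from pathlib import Path


def centisecs_to_seconds(raw: str) -> float:
    """Convert a /proc value given in hundredths of a second to seconds."""
    return int(raw.strip()) / 100.0


def read_writeback_knob(name: str, base: str = "/proc/sys/vm"):
    """Return the named tunable in seconds, or None if it cannot be read.

    `name` and `base` are illustrative parameters; the knob names used below
    are assumed to be the ones relevant to the delay discussed above.
    """
    try:
        return centisecs_to_seconds((Path(base) / name).read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # How old dirty data may get before it is written back, and how often
    # the periodic writeback wakes up (both in seconds).
    for knob in ("dirty_expire_centisecs", "dirty_writeback_centisecs"):
        print(knob, read_writeback_knob(knob))
```

On a system without /proc, or where the files are unreadable, the helper simply returns None.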

And don't tell me about sysrq. I know about sysrq. It's very convenient 
for kernel people, but it's not like most people use it.

But I absolutely hear you - people seem to think that "correctness" trumps 
all, but in reality, quite often users will be happier with a faster 
system - even if they know that they may lose data. They may curse 
themselves (or, more likely, the system) when they _do_ lose data, but 
they'll make the same choice all over two months later.

Which is why I think that if the filesystem people think that the 
"data=ordered" mode is too damn fundamentally hard to make fast in the 
presence of "fsync", and all sane people (definition: me) think that the 
30-second window for either "data=writeback" or the ext4 data writeout is 
too fragile, then we should look into something in between.

Because, in the end, you do have to balance performance vs safety when it 
comes to disk writes. You absolutely have to delay things for performance, 
but it is always going to involve the risk of losing data that you do care 
about, but that you aren't willing (or able - random apps and tons of 
scripting come to mind) to do a fsync over.
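
To make "doing a fsync over" the data concrete, here is a minimal sketch of what an application pays for when it does care: write a temporary file, fsync it, then rename it over the old one. The function name `durable_replace` is hypothetical, and this is one common user-space pattern rather than anything prescribed in this thread.

```python
import os
import tempfile


def durable_replace(path: str, data: bytes) -> None:
    """Replace `path` with `data`, flushing the data before the rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temporary file in the same directory, so the rename stays on one filesystem.
    # Note: mkstemp creates the file with mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # wait until the kernel has written the data out
        os.replace(tmp_path, path)  # rename over the old file
    except BaseException:
        # Something failed before the rename completed: drop the temporary file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Also flush the directory entry. Some filesystems refuse this, so a
    # failure here is tolerated in this sketch.
    try:
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    except OSError:
        pass
```

The `os.fsync` call is where the cost lands: the caller blocks until the data is on disk, which is exactly the wait that shell scripts and most applications never ask for.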

Which is why I, personally, would probably be perfectly happy with a 
"async ordered" mode, for example. At least START the data writeback when 
writing back metadata, but don't necessarily wait for it (and don't 
necessarily make it go first). Turn the "30 second window of death" into 
something much harder to hit.

			Linus