Date:	Thu, 26 May 2011 06:49:46 -0400
From:	Theodore Tso <tytso@....EDU>
To:	"D. Jansen" <d.g.jansen@...glemail.com>
Cc:	Oliver Neukum <oneukum@...e.de>,
	Dave Chinner <david@...morbit.com>,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org
Subject: Re: [rfc] Ignore Fsync Calls in Laptop_Mode


On May 26, 2011, at 3:01 AM, D. Jansen wrote:

> That seems to be the big ordering issue. I had always assumed that
> user space writes (by the same app to the same file) would be
> committed in order. Is that really not the case?
> 
> Wouldn't most app programmers assume ordering? Wouldn't that always
> possibly be an issue? Or do all the apps that require ordered writes
> use fsync. There will surely be some who require ordering but don't
> fsync. And without ordering, some apps won't be able to avoid fsync
> without data safety issues.

I really don't like using the word "ordering" the way Dave used it,
because it's file system lingo that *always* confuses civilians.
And "insider" language like that isn't helpful for communication,
unless you're certain there are only experts in the room...

As Dave said earlier, "ordering" in the sense he was using it
refers strictly to ensuring consistency after a crash.

Now, there are two levels of consistency; one is file system
level consistency, and the other is application level 
consistency.   It used to be that desktop drives would
lie about forcing data to disk in response to a FLUSH 
CACHE command, "yes sir, I promise the data is on
the disk, sir!", because it resulted in higher WINBENCH
scores.  File system engineers hated this, because
a primary tool we have for assuring that file systems
don't look like Swiss cheese after a crash was completely
unreliable.   Fortunately, those disks have largely
disappeared from the marketplace.

The suggestion of making fsync a no-op is essentially
asking for a knob that breaks application-level consistency
the same way those broken hard drives broke file system
consistency by making the FLUSH CACHE command
unreliable.   Maybe improving battery lifetime is a more
honorable excuse than the purely mercenary goal of
selling more disk drives, but it can still break applications
after a crash.

Now, you may think that you're prepared for that.   After all,
you're already prepared to say that you're willing to lose
the last 15 minutes of work or whatever, right?

Well, wrong.  It's not so simple as that.  If you're only
talking about simple, flat, human-readable text files,
maybe it would work that way.   But what about complex,
binary databases?  Like the SQLite databases used by
Firefox and Chrome?   Or MySQL databases?   More
and more, sophisticated applications, even desktop
applications, are using these complex data stores,
and the libraries which update these complex data
stores rely on fsync() to prevent their database files
from looking like swiss cheese.   If you crash while
fsync() has been disabled, the entire database file
could be completely trashed, which could be hours,
days, weeks, or months of work lost.
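
To make that concrete, here is a minimal sketch of the kind of
update sequence such libraries depend on.  It is not how SQLite
or MySQL actually do it (they use journals or write-ahead logs);
it is just the simplest pattern I can show, and the helper name
atomic_replace() and the ".tmp" suffix are made up for
illustration.  The point is that each fsync() is a promise the
next step relies on.  Turn fsync() into a no-op and the rename
can reach the disk before the data it points to does.

```python
import os

def atomic_replace(path, data):
    """Replace `path` with `data` (bytes) so that, after a crash, the file
    holds either the complete old contents or the complete new contents.

    Hypothetical helper for illustration only; real databases use
    journals or write-ahead logs, but they lean on fsync() the same way.
    """
    tmp = path + ".tmp"  # assumed naming scheme, not any library's convention
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # Step 1: force the new contents to stable storage *before*
        # anything refers to them under the real name.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Step 2: atomically switch the name over to the new contents.
    os.replace(tmp, path)

    # Step 3: force the directory entry itself to stable storage, so
    # the rename is not lost either (works on Linux; an assumption
    # elsewhere).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If step 1 silently does nothing, a crash shortly after step 2
can leave the real file name pointing at a file whose blocks
were never written.  That is not "the last 15 minutes lost";
that is the whole file gone.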

So the resistance that people like Dave have to your
proposal can be summed up by Confucius if you are
Chinese: "Never impose on others what you would
not choose for yourself."  Or if you are Jewish, the Rabbi
Hillel said: "That which is hateful to you, do not do to
your fellow. That is the whole Torah; the rest is the
explanation; go and learn."   Or if you are a Muslim,
the Prophet Mohammed: "Hurt no one so that no one
may hurt you."   Breaking fsync() is like hard drives that
break faith with file system authors by lying when they 
say everything is safely written to stable storage.  And 
what are databases but complex file systems living inside
a single file?

-- Ted
