linux-kernel - Re: IO error semantics

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20100125175954.GC2018@laptop>
Date:	Tue, 26 Jan 2010 04:59:54 +1100
From:	Nick Piggin <npiggin@...e.de>
To:	Ric Wheeler <rwheeler@...hat.com>
Cc:	tytso@....edu, Anton Altaparmakov <aia21@....ac.uk>,
	Dave Chinner <david@...morbit.com>, Jan Kara <jack@...e.cz>,
	Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>,
	linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andreas Dilger <adilger@....com>,
	Satoshi OSHIMA <satoshi.oshima.fk@...achi.com>,
	linux-fsdevel@...r.kernel.org
Subject: Re: IO error semantics

On Mon, Jan 25, 2010 at 12:50:19PM -0500, Ric Wheeler wrote:
> On 01/25/2010 12:47 PM, tytso@....edu wrote:
> >On Mon, Jan 25, 2010 at 10:23:57AM -0500, Ric Wheeler wrote:
> >>
> >>For permanent write errors, I would expect any modern drive to do a
> >>sector remapping internally. We should never need to track this kind
> >>of information for any modern device that I know of (S-ATA, SAS,
> >>SSD's and raid arrays should all handle this).
> >
> >... and if the device is run out of all of its blocks in its spare
> >blocks pool, it's probably well past the time to replace said disk.
> >
> >BTW, I really liked Dave Chinner's summary of the issues involved; I
> >ran into Kawai-san last week at Linux.conf.au, and we discussed pretty
> >much the same thing over lunch.  (i.e., that it's a hard problem, and
> >in some cases we need to retry the writes, such as a transient FC path
> >problem --- but some kind of write throttling is critical or we could
> >end up choking the VM due to too many pages getting dirtied and no way
> >of cleaning them.)
> >
> >						- Ted
> 
> Also note that retrying writes (or reads for that matter) often are
> counter productive. For those of us who have suffered with trying to
> migrate data off of an old, failing disk onto a new, shiny one,
> excessive retries can be painful...

That is probably true most of the time. So some sane defaults should
be attempted that work for most cases.

After that, retrying I was imagining should be driven by the
application. So: attempting to read or fsync again.

What should not happen is for the page to be marked !dirty or !uptodate.
This randomly breaks write to read consistency without necessarily even
any error reported, so it seems really hard for an app to do the right
thing there.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/