linux-kernel - Re: [dm-devel] Announcement: STEC EnhanceIO SSD caching software for Linux kernel

Message-ID: <20130124234524.GQ26407@google.com>
Date:	Thu, 24 Jan 2013 15:45:24 -0800
From:	Kent Overstreet <koverstreet@...gle.com>
To:	Amit Kale <akale@...c-inc.com>
Cc:	"thornber@...hat.com" <thornber@...hat.com>,
	device-mapper development <dm-devel@...hat.com>,
	"kent.overstreet@...il.com" <kent.overstreet@...il.com>,
	Mike Snitzer <snitzer@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"linux-bcache@...r.kernel.org" <linux-bcache@...r.kernel.org>
Subject: Re: [dm-devel] Announcement: STEC EnhanceIO SSD caching software for
 Linux kernel

On Thu, Jan 17, 2013 at 03:39:40AM -0800, Kent Overstreet wrote:
> Suppose I could fill out the bcache version...
> 
> On Thu, Jan 17, 2013 at 05:52:00PM +0800, Amit Kale wrote:
> > 11. Error conditions - Handling power failures, intermittent and permanent device failures.
> 
> Power failures and device failures yes, intermittent failures are not
> explicitly handled.

A coworker pointed out that bcache actually does handle some intermittent IO
errors. I just added a section on error handling to the documentation:
http://atlas.evilpiepirate.org/git/linux-bcache.git/tree/Documentation/bcache.txt?h=bcache-dev

To cut and paste,

Bcache tries to transparently handle IO errors to/from the cache device without
affecting normal operation; if it sees too many errors (the threshold is
configurable, and defaults to 0) it shuts down the cache device and switches all
the backing devices to passthrough mode.
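
As a rough illustration of the threshold behaviour described above, here is a
minimal Python model. It is not bcache code: the class name, the attribute
names and the exact comparison (more errors than the limit) are assumptions
made for this sketch. The only things taken from the text are that the limit
is configurable, defaults to 0, and that tripping it shuts down the cache and
switches every backing device to passthrough.

```python
class CacheErrorTracker:
    """Hypothetical model of the cache device error threshold (not bcache code)."""

    def __init__(self, backing_devices, error_limit=0):
        # error_limit mirrors the configurable threshold; 0 is the stated default.
        self.error_limit = error_limit
        self.errors = 0
        self.cache_enabled = True
        # Mode of each backing device; "cached" is a placeholder name for
        # whatever caching mode was active before the shutdown.
        self.modes = {dev: "cached" for dev in backing_devices}

    def record_error(self):
        """Count one cache IO error; return True if the cache is still in use."""
        self.errors += 1
        # Assumption: "too many" means strictly more errors than the limit,
        # so with the default of 0 the first error already trips it.
        if self.cache_enabled and self.errors > self.error_limit:
            self.cache_enabled = False
            for dev in self.modes:
                self.modes[dev] = "passthrough"
        return self.cache_enabled
```

With `error_limit=2` in this model, the first two errors are tolerated and the
third one disables the cache.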

 - For reads from the cache, if they error we just retry the read from the
   backing device.

 - For writethrough writes, if the write to the cache errors we just switch to
   invalidating the data at that LBA in the cache (i.e. the same thing we do for
   a write that bypasses the cache).

 - For writeback writes, we currently pass that error back up to the
   filesystem/userspace. This could be improved - we could retry it as a write
   that skips the cache so we don't have to error the write.

 - When we detach, we first try to flush any dirty data (if we were running in
   writeback mode). It currently doesn't do anything intelligent if it fails to
   read some of the dirty data, though.
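
The first three cases in the list above can be sketched as a toy Python model.
Everything here is hypothetical and only for illustration: `Device`,
`SimulatedIOError` and the three functions are names invented for this sketch,
devices are plain dictionaries keyed by LBA, and the read path assumes the
cached block is clean (identical to what is on the backing device).

```python
class SimulatedIOError(Exception):
    """Stands in for an IO error reported by a device."""


class Device:
    """Toy block device: a dict of lba -> data with switchable failures."""

    def __init__(self):
        self.blocks = {}
        self.fail_reads = False
        self.fail_writes = False

    def read(self, lba):
        if self.fail_reads:
            raise SimulatedIOError("read failed at lba %d" % lba)
        return self.blocks[lba]

    def write(self, lba, data):
        if self.fail_writes:
            raise SimulatedIOError("write failed at lba %d" % lba)
        self.blocks[lba] = data

    def invalidate(self, lba):
        # In this model invalidation is a pure bookkeeping step that never fails.
        self.blocks.pop(lba, None)


def cached_read(cache, backing, lba):
    """Read via the cache; on a cache read error, retry from the backing device."""
    if lba in cache.blocks:
        try:
            return cache.read(lba)
        except SimulatedIOError:
            pass  # fall through to the backing device
    return backing.read(lba)


def writethrough_write(cache, backing, lba, data):
    """Write to backing and cache; on a cache write error, invalidate instead."""
    backing.write(lba, data)
    try:
        cache.write(lba, data)
    except SimulatedIOError:
        # Same treatment as a write that bypasses the cache.
        cache.invalidate(lba)


def writeback_write(cache, lba, data):
    """Write to the cache only; a cache error propagates to the caller."""
    cache.write(lba, data)
```

In this model a failed writeback write simply raises to the caller, matching
the current behaviour described above; the suggested improvement would be to
catch the error and retry as a write that skips the cache.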