Date:	Tue, 05 May 2009 21:53:57 +0000
From:	James Bottomley <James.Bottomley@...senPartnership.com>
To:	Philipp Reisner <philipp.reisner@...bit.com>
Cc:	david@...g.hm, Willy Tarreau <w@....eu>,
	Bart Van Assche <bart.vanassche@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Jens Axboe <jens.axboe@...cle.com>,
	Greg KH <gregkh@...e.de>, Neil Brown <neilb@...e.de>,
	Sam Ravnborg <sam@...nborg.org>, Dave Jones <davej@...hat.com>,
	Nikanth Karthikesan <knikanth@...e.de>,
	Lars Marowsky-Bree <lmb@...e.de>,
	Kyle Moffett <kyle@...fetthome.net>,
	Lars Ellenberg <lars.ellenberg@...bit.com>
Subject: Re: [PATCH 00/16] DRBD: a block device for HA clusters

On Tue, 2009-05-05 at 23:45 +0200, Philipp Reisner wrote:
> > I also think you're not quite looking at the important case: if you
> > think about it, the real necessity for the ordered domain is the
> > network, not so much the actual secondary server.  The reason is that
> > it's very hard to find a failure case where the write order on the
> > secondary from the network tap to disk actually matters (as long as the
> > flight into the network tap was in order).  The standard failure is of
> > the primary, not the secondary, so the network stream stops and so does
> > the secondary writing: as long as we guarantee to stop at a consistent
> > point in flight, everything works.  If the secondary fails while the
> > primary is still up, that's just a standard replay to bring the
> > secondary back into replication, so the issue doesn't arise there
> > either.
> 
> A common power failure is possible. We aim for an HA system; we cannot
> ignore a possible failure scenario. No user will buy: "Well, in most
> scenarios we do it correctly, but in the unlikely case of a common power
> failure, where you also lose your former primary at the same time, you
> might have a secondary with the last write but not the write before it!"
> 
> Correctness before efficiency!

Well, you have to agree that during a resync from the activity log,
which replays the primary disk from one end to the other, the secondary
is completely corrupt if a primary failure occurs before the resync
completes.  That can be triggered by a simple network outage, and so is
a far more common event than cascading dual failures.  It's really a
question of where you focus your effort to eliminate the corner cases.
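
A toy model (not DRBD code, and deliberately worst-case: every block is
assumed to have changed) of the hazard above: a linear end-to-end resync
copies the primary's blocks in address order, so an interrupted resync
leaves the secondary with a front half of new data spliced onto a back
half of old data, an image that corresponds to no point in time on
either node:

```python
# Hypothetical 4-block disk images; names are illustrative only.
OLD = ["o0", "o1", "o2", "o3"]   # secondary's stale image before resync
NEW = ["n0", "n1", "n2", "n3"]   # primary's current image

def interrupted_resync(copied):
    """Secondary's image after `copied` blocks of a linear resync."""
    return NEW[:copied] + OLD[copied:]

# Only a complete resync (or one that never started) yields an image the
# primary or the old secondary ever actually had.
for k in range(len(OLD) + 1):
    image = interrupted_resync(k)
    consistent = image in (OLD, NEW)
    assert consistent == (k in (0, len(OLD)))
```

The point the toy makes is the one in the text: an ordered replication
stream can be cut anywhere and still leave a usable (if stale) secondary,
while a half-finished linear resync cannot.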

> But I will stop this discussion now. Proving that DRBD does some
> details better than the md/nbd approach becomes pointless once we have
> agreed that DRBD can be merged as a driver. We will focus on the
> necessary code cleanups.

I agree.  Also, HA is full of corner cases like this, and opinion is
endlessly divided over which corner cases are more important than
others.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
