linux-kernel - Re: Mis-Design of Btrfs?

Message-ID: <CAGh96GGJusudLzNePVVKqOwm8U9=HAFyz43Zm9Mypy2F5-PdSw@mail.gmail.com>
Date:	Thu, 14 Jul 2011 13:50:31 -0700
From:	Erik Jensen <eriksjunk@...nsn.net>
To:	John Stoffel <john@...ffel.org>
Cc:	Alasdair G Kergon <agk@...hat.com>, NeilBrown <neilb@...e.de>,
	Ric Wheeler <rwheeler@...hat.com>,
	Nico Schottelius <nico-lkml-20110623@...ottelius.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Chris Mason <chris.mason@...cle.com>,
	linux-btrfs <linux-btrfs@...r.kernel.org>
Subject: Re: Mis-Design of Btrfs?

 On Thu, Jul 14, 2011 at 12:50 PM, John Stoffel <john@...ffel.org> wrote:
>>>>>> "Alasdair" == Alasdair G Kergon <agk@...hat.com> writes:
>
> Alasdair> On Thu, Jul 14, 2011 at 04:38:36PM +1000, Neil Brown wrote:
>>> It might make sense for a device to be able to report what the maximum
>>> 'N' supported is... that might make stacked raid easier to manage...
>
> Alasdair> I'll just say that any solution ought to be stackable.
>
> I've been mulling this over too and wondering how you'd handle this,
> because upper layers really can't peek down into lower layers easily,
> as far as I understand things.
>
> So if you have btrfs -> luks -> raid1 -> raid6 -> nbd -> remote disks
>
> How does btrfs handle errors (or does it even see them!) from the
> raid6 level when a single nbd device goes away?  Or taking the
> original example, when btrfs notices a checksum isn't correct, how
> would it push down multiple levels to try and find the correct data?
>
> Alasdair> This means understanding both that the number of data access
> Alasdair> routes may vary as you move through the stack, and that this
> Alasdair> number may depend on the offset within the device.
>
> It almost seems to me that the retry needs to be done at each level on
> its own, without pushing down or up the stack.  But this doesn't
> handle the wrong file checksum issue.
>
> Hmm... maybe instead of just one number, we need another to count the
> levels down you go (or just split a 16-bit integer in half, the bottom
> half being the count of tries, the upper half being the levels down to
> try that read?)
>
> It seems to defeat the purpose of layers if you can go down and find
> out how many layers there are underneath you....
>
> John

A random thought: what if we allow the number to wrap at each level
and, each time it wraps, increment the number passed to the next lower
level?

A zero would propagate down, letting each level do what it wants:
luks: 0
raid1: 0
raid6: 0
nbd: 0

And higher numbers would indicate the method at each level:

For a 1:
luks: 1
raid1: 1
raid6: 1
nbd: 1

For a 3:
luks: 1 (only one possibility, passes three down)
raid1: 1 (two possibilities, so we wrap back to one and pass two down,
since we wrapped once)
raid6: 2 (not wrapped)
nbd: 1
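
The wrap-and-carry arithmetic in the examples above can be sketched in
Python. This is only an illustration, not kernel code: the function names
are made up, and the number of read methods per level is an assumption
(in particular, raid6 is given three ways purely for the example; the
text only implies it has at least two).

```python
# Hypothetical stack, top to bottom: (level name, number of ways to read).
# The raid6 count of 3 is an assumption made for illustration only.
STACK = [("luks", 1), ("raid1", 2), ("raid6", 3), ("nbd", 1)]


def split_retry_number(n, ways):
    """Split retry number n at a level that offers `ways` read methods.

    Returns (local, passed_down): the method this level uses, and the
    number handed to the next lower level. Zero means "do what you want"
    and is passed through unchanged.
    """
    if n == 0:
        return 0, 0
    local = (n - 1) % ways + 1          # wrap back to 1 after `ways`
    passed_down = (n - 1) // ways + 1   # one more for each wrap
    return local, passed_down


def walk_stack(n, stack):
    """Return the method each level would pick for retry number n."""
    chosen = []
    for name, ways in stack:
        local, n = split_retry_number(n, ways)
        chosen.append((name, local))
    return chosen


print(walk_stack(3, STACK))
# [('luks', 1), ('raid1', 1), ('raid6', 2), ('nbd', 1)]
```

With these assumed counts, walk_stack(3, STACK) reproduces the "For a 3"
table, and walk_stack(0, STACK) yields zero at every level.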

When the bottom-most level gets an N that it can't handle, it would
return EINVAL, which would be propagated up the stack.

This would allow the same algorithm of incrementing N until we receive
good data or EINVAL, and would exhaust all ways of reading the data at
all levels.
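
A rough Python sketch of that retry loop follows, under the same kind of
assumptions: the per-level counts in STACK are invented for illustration
(raid6 is given three ways), and try_read is a hypothetical callback that
stands in for "did this combination of methods return good data".

```python
import errno

# Hypothetical stack, top to bottom: (level name, number of ways to read).
STACK = [("luks", 1), ("raid1", 2), ("raid6", 3), ("nbd", 1)]


def read_with_retries(stack, try_read):
    """Increment N until try_read() succeeds or the bottom level rejects N.

    Returns (status, n): status is 0 on success or -errno.EINVAL once
    every combination has been tried; n is the last retry number used.
    """
    n = 1
    while True:
        choices = []
        remaining = n
        # Every level except the bottom wraps and carries downward.
        for name, ways in stack[:-1]:
            choices.append((name, (remaining - 1) % ways + 1))
            remaining = (remaining - 1) // ways + 1
        # The bottom-most level does not wrap: an N it cannot handle
        # is reported as EINVAL and propagated up.
        bottom_name, bottom_ways = stack[-1]
        if remaining > bottom_ways:
            return -errno.EINVAL, n
        choices.append((bottom_name, remaining))
        if try_read(choices):
            return 0, n
        n += 1


# Example: a reader that never finds good data exhausts all combinations.
status, last_n = read_with_retries(STACK, lambda choices: False)
```

With the assumed counts there are 1 * 2 * 3 * 1 = 6 combinations, so a
reader that never succeeds gets EINVAL at N = 7, after each combination
has been tried exactly once.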