Message-ID: <CAPcyv4gNLcw5ucb65RjRyuh=22vMDmrmyh5erQ50uJ45s-UMEQ@mail.gmail.com>
Date: Mon, 22 Jun 2015 09:54:51 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Christoph Hellwig <hch@....de>
Cc: Jens Axboe <axboe@...nel.dk>,
"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
Boaz Harrosh <boaz@...xistor.com>,
"Kani, Toshimitsu" <toshi.kani@...com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linux ACPI <linux-acpi@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCH 14/15] libnvdimm: support read-only btt backing devices
On Mon, Jun 22, 2015 at 9:45 AM, Christoph Hellwig <hch@....de> wrote:
> On Mon, Jun 22, 2015 at 09:36:50AM -0700, Dan Williams wrote:
>> In that case "don't stack" is too coarse of a hammer. I see this as a
>> request to hide the subordinate ULD which is a new capability that DM
>> and MD might benefit from as well. We already have the case in MD
>> where it internally holds a reference to bdev that has been hot
>> removed, it seems not much of a stretch to have stacking drivers be
>> able to hide device nodes for bdevs that they are holding.
>
> I don't see why you're comparing with MD and DM here. MD and DM
> sit cleanly on top of any block device. If btt was independent of
> libnvdimm and just used ->rw_bytes we could see it as this.
>
> But it's all a giant entangled mess, where btt for example is probed
> by libnvdimm. At the same time pmem.c isn't really a true block
> driver, it's really just a trivial shim between the block API
> and pmem-style memcpy. Especially with the proper pmem API btt
> would become cleaner just calling that directly.

The pmem API does nothing to fix torn sectors; there are no extra
atomicity guarantees that come from those instructions.
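
To make "torn sector" concrete, here is a toy user-space model in
Python (not kernel code). It assumes a power loss can interrupt a
sector-sized copy partway through, and it contrasts an in-place copy
with a write-elsewhere-then-remap scheme. All names (DirectMedia,
RemappedMedia, copy_with_power_loss, demo) are hypothetical, and the
remapping is only a simplified stand-in for the idea behind BTT, not
its actual on-media format.

```python
SECTOR = 512


class TornWrite(Exception):
    """Simulated power loss in the middle of a copy."""


def copy_with_power_loss(dst, offset, data, fail_after=None):
    """Copy data into dst; stop after fail_after bytes to model power loss."""
    n = len(data) if fail_after is None else min(fail_after, len(data))
    dst[offset:offset + n] = data[:n]
    if n < len(data):
        raise TornWrite()


class DirectMedia:
    """In-place sector writes, as a plain memcpy into pmem would do."""

    def __init__(self, nsectors):
        self.media = bytearray(nsectors * SECTOR)

    def write(self, lba, data, fail_after=None):
        copy_with_power_loss(self.media, lba * SECTOR, data, fail_after)

    def read(self, lba):
        return bytes(self.media[lba * SECTOR:(lba + 1) * SECTOR])


class RemappedMedia:
    """Write to a spare block, then switch one map entry (assumed atomic)."""

    def __init__(self, nsectors):
        # One extra physical block serves as the free block.
        self.media = bytearray((nsectors + 1) * SECTOR)
        self.map = list(range(nsectors))
        self.free = nsectors

    def write(self, lba, data, fail_after=None):
        # New data never overwrites the block the map currently points at.
        copy_with_power_loss(self.media, self.free * SECTOR, data, fail_after)
        # Only reached if the copy completed; the old block becomes free.
        self.map[lba], self.free = self.free, self.map[lba]

    def read(self, lba):
        pba = self.map[lba]
        return bytes(self.media[pba * SECTOR:(pba + 1) * SECTOR])


def demo(media_cls):
    """Write a full sector of 'A', then lose power 100 bytes into a 'B' write."""
    m = media_cls(4)
    m.write(1, b"A" * SECTOR)
    try:
        m.write(1, b"B" * SECTOR, fail_after=100)
    except TornWrite:
        pass
    return m.read(1)
```

In this model demo(DirectMedia) returns 100 bytes of "B" followed by
412 bytes of "A" (a torn sector), while demo(RemappedMedia) still
returns the old sector of all "A". The point is only that the
atomicity comes from the indirection, not from the copy primitive.
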
>> Yes, if they want to use DAX they should do it consciously and audit
>> their application to be sure it is safe to abandon atomic sector
>> guarantees. With the current flexibility to do BTT on a partition
>> they can do this conversion piecemeal and, for example, keep metadata
>> on BTT and data on DAX.
>
> By that logic you'd want to attach BTT by default and allow opt-out
> at some level. This could be a libnvdimm-level partitioning scheme,
> which would also allow storing the bit if BTT is used or not persistently.
> Or it could be on fine grained boundaries which might be more useful.

Well, let's start with per-disk BTT and see where that gets us; we can
always ramp up complexity later. I'd just as soon make the default
opt-in/out a Kconfig toggle with a sysfs override.