Message-ID: <d073b4a1-ff90-b44a-adf2-ad44c0d70c69@hpe.com>
Date: Thu, 29 Jun 2017 18:12:52 -0400
From: Linda Knippers <linda.knippers@....com>
To: Dan Williams <dan.j.williams@...el.com>
CC: "linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
Jan Kara <jack@...e.cz>,
Matthew Wilcox <mawilcox@...rosoft.com>,
X86 ML <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Al Viro <viro@...iv.linux.org.uk>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Christoph Hellwig <hch@....de>
Subject: Re: [PATCH v4 12/16] libnvdimm, nfit: enable support for volatile
ranges

On 6/29/2017 5:50 PM, Dan Williams wrote:
> On Thu, Jun 29, 2017 at 2:16 PM, Linda Knippers <linda.knippers@....com> wrote:
>> On 06/29/2017 04:42 PM, Dan Williams wrote:
>>> On Thu, Jun 29, 2017 at 12:20 PM, Linda Knippers <linda.knippers@....com> wrote:
>>>> On 06/29/2017 01:54 PM, Dan Williams wrote:
>>>>> Allow volatile nfit ranges to participate in all the same infrastructure
>>>>> provided for persistent memory regions.
>>>>
>>>> This seems to be a bit more than "other rework".
>>>
>>> It's part of the rationale for having a "write_cache" control
>>> attribute. There's only so much I can squeeze into the subject line,
>>> but it is mentioned in the cover letter.
>>>
>>>>> A resulting namespace
>>>>> device will still be called "pmem", but the parent region type will be
>>>>> "nd_volatile".
>>>>
>>>> What does this look like to a user or admin? How does someone know that
>>>> /dev/pmemX is persistent memory and /dev/pmemY isn't? Someone shouldn't
>>>> have to weed through /sys, ndctl, or some other interface to figure that out
>>>> in the future if they don't have to do that today. We have different
>>>> names for BTT namespaces. Is there a different name for volatile ranges?
>>>
>>> No, the block device name is still /dev/pmem. It's already the case
>>> that you need to look beyond just the name of the device to figure
>>> out if something is actually volatile or not (see memmap=nn!ss
>>> configurations),
>>
>> I don't have any experience with using memmap but if it's primarily used
>> by developers without NVDIMMs, they'd know it's not persistent. Or is it
>> primarily used by administrators using non-NFIT NVDIMMs, in which case it
>> is persistent?
>>
>> In any case, how exactly does one determine whether the device is volatile
>> or not? I'm dumb so tell me the command line or API.
>
> Especially with memmap= or e820-defined memory, it's unknowable from
> the kernel. We don't know if the user is using it to cover for a
> platform where there is no BIOS support for advertising persistent
> memory, or if they have a BIOS that does not produce an NFIT, as is the
> case here [1], or if it is some developer just testing with no
> expectation of persistence.
>
> [1]: https://github.com/pmem/ndctl/issues/21

Ok. I'm not really concerned about those cases but was asking since
you mentioned memmap as an example.

In any case, how does someone, like a system administrator, confirm that
a /dev/pmem device is backed by memory that claims to be persistent? Is
there a specific ndctl command line that would make it obvious which
/dev/pmem devices map to ranges that claim to be persistent?
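
To make this concrete, here is roughly the kind of check I would want to
be able to point an administrator at. This is only a sketch; I'm assuming
that /sys/block/pmemX/device resolves to the namespace device, that its
parent is the region device, and that the region's devtype attribute reads
"nd_pmem" for ranges that claim persistence and "nd_volatile" for these
new volatile ranges (per the patch description above):

    import os

    def region_type(pmem_dev):
        # "/dev/pmem0" -> "pmem0"
        name = os.path.basename(pmem_dev)
        # Assumption: the block device's "device" link points at the
        # namespace device, whose parent is the regionX device.
        ns = os.path.realpath("/sys/block/%s/device" % name)
        region = os.path.dirname(ns)
        # Assumption: devtype distinguishes nd_pmem from nd_volatile.
        with open(os.path.join(region, "devtype")) as f:
            return f.read().strip()

    print(region_type("/dev/pmem0"))

If that (or an ndctl equivalent) is the intended interface, documenting it
would answer my question.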
>>> so I would not be in favor of changing the device
>>> name if we think the memory might not be persistent. Moreover, I think
>>> it was a mistake that we change the device name for btt vs. non-btt, and
>>> I'm glad Matthew talked me out of making the same mistake with
>>> memory-mode vs raw-mode pmem namespaces. So, the block device name
>>> just reflects the driver of the block device, not the properties of
>>> the device, just like all other block device instances.
>>
>> I agree that creating a new device name for BTT was perhaps a mistake,
>> although it would be good to know how to query a device property for
>> sector atomicity. The difference between BTT vs. non-BTT seems less
>> critical to me than knowing in an obvious way whether the device is
>> actually persistent.
>
> We don't have a good way to answer "actually persistent" in the
> general case. I'm thinking of cases where the energy source on the
> DIMM has died, or we trigger one of the conditions that leads to the
> "unable to guarantee persistence of writes" message.

There are certainly error conditions that can happen, and we've talked
about that a bit in our health status discussions. I think the question
of whether the device is healthy enough to be persistent right now
is different from whether the device is never ever going to be persistent.
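
Those really are two separate checks. The "healthy enough right now"
question looks like something you would get from per-DIMM status, while
the "never going to be persistent" question is the region-type check above.
As a sketch of the first one (assuming the NFIT memdev flags are exposed
under /sys/bus/nd/devices/nmemX/nfit/flags, which is my reading of the
nfit driver rather than anything this patch defines):

    import glob

    # Assumption: arming/flush problems show up as tokens such as
    # "save_fail", "flush_fail", or "not_armed" in the flags file.
    for path in glob.glob("/sys/bus/nd/devices/nmem*/nfit/flags"):
        with open(path) as f:
            flags = f.read().split()
        dimm = path.split("/")[-3]
        if flags:
            print("%s: flags: %s" % (dimm, " ".join(flags)))
        else:
            print("%s: no flags set" % dimm)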
> The /dev/pmem
> device name just tells you that your block device is hosted by a
> driver that knows how to handle persistent memory constraints, but any
> other details about the nature of the address range need to come from
> other sources of information, and potentially information sources that
> the kernel does not know about.

I'm asking what that other source of information is in this specific case,
where we're exposing pmem devices that will never ever be persistent.
Before we add these devices, I think we should be able to tell the user
how they can learn the properties of the underlying device.
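
For example, if ndctl can report the region type behind each namespace,
that may be all we need to document. A sketch only; I'm assuming that
"ndctl list -RN" nests namespaces under their parent region and that the
region objects carry a "type" field, neither of which I've verified:

    import json
    import subprocess

    out = subprocess.check_output(["ndctl", "list", "-RN"]).decode()
    data = json.loads(out)
    # Assumption about the JSON layout: either a flat list of regions
    # or an object with a "regions" array.
    regions = data.get("regions", []) if isinstance(data, dict) else data

    for region in regions:
        rtype = region.get("type", "unknown")
        for ns in region.get("namespaces", []):
            blockdev = ns.get("blockdev")
            if blockdev:
                print("/dev/%s: region %s, type %s"
                      % (blockdev, region.get("dev"), rtype))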
-- ljk