Message-ID: <CAPcyv4j=M8V_36C-HhiJM7MHzNLFcpP=nec=LHnob5+qZ4xgYw@mail.gmail.com>
Date: Thu, 19 Mar 2015 13:59:27 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Boaz Harrosh <boaz@...xistor.com>, linux-arch@...r.kernel.org,
Jens Axboe <axboe@...nel.dk>, riel@...hat.com,
linux-raid <linux-raid@...r.kernel.org>,
linux-nvdimm <linux-nvdimm@...1.01.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Christoph Hellwig <hch@...radead.org>,
Mel Gorman <mgorman@...e.de>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [Linux-nvdimm] [RFC PATCH 0/7] evacuate struct page from the block layer

On Thu, Mar 19, 2015 at 12:59 PM, Andrew Morton
<akpm@...ux-foundation.org> wrote:
> On Thu, 19 Mar 2015 17:54:15 +0200 Boaz Harrosh <boaz@...xistor.com> wrote:
>
>> On 03/19/2015 03:43 PM, Matthew Wilcox wrote:
>> <>
>> >
>> > Dan missed "Support O_DIRECT to a mapped DAX file". More generally, if we
>> > want to be able to do any kind of I/O directly to persistent memory,
>> > and I think we do, we need to do one of:
>> >
>> > 1. Construct struct pages for persistent memory
>> > 1a. Permanently
>> > 1b. While the pages are under I/O
>> > 2. Teach the I/O layers to deal in PFNs instead of struct pages
>> > 3. Replace struct page with some other structure that can represent both
>> > DRAM and PMEM
>> >
>> > I'm personally a fan of #3, and I was looking at the scatterlist as
>> > my preferred data structure. I now believe the scatterlist as it is
>> > currently defined isn't sufficient, so we probably end up needing a new
>> > data structure. I think Dan's preferred method of replacing struct
>> > pages with PFNs is actually less intrusive, but doesn't give us as
>> > much advantage (an entirely new data structure would let us move to an
>> > extent based system at the same time, instead of sticking with an array
>> > of pages). Clearly Boaz prefers 1a, which works well enough for the
>> > 8GB NV-DIMMs, but not well enough for the 400GB NV-DIMMs.
>> >
>> > What's your preference? I guess option 0 is "force all I/O to go
>> > through the page cache and then get copied", but that feels like a nasty
>> > performance hit.
>>
>> Thanks Matthew, you have summarized it perfectly.
>>
>> I think #1b might have merit, as well.
>
> It would be interesting to see what a 1b implementation looks like and
> how it performs. We already allocate a bunch of temporary things to
> support in-flight IO (bio, request) and allocating pageframes on the
> same basis seems a fairly logical fit.

At least for block I/O, it seems the only place we really need struct
page infrastructure is kmap(). Given that we already need a kmap_pfn()
solution for option 2, a "dynamic allocation" stop along that
development path may just naturally fall out.
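
To make the comparison concrete, here is a rough userspace model in C of
option 2 (I/O segments that carry a PFN) next to option 1b (a page
descriptor that only exists while the I/O is in flight). It is a sketch
under loud assumptions: model_kmap_pfn(), struct model_pfn_vec, struct
model_page and every other name below are hypothetical and invented for
illustration, none of them are kernel interfaces, and a static array
stands in for persistent memory.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)
#define MODEL_NR_PAGES   4UL

/* Stand-in for a persistent-memory range with no struct page array. */
static unsigned char model_pmem[MODEL_NR_PAGES * MODEL_PAGE_SIZE];

/* Option 2: an I/O segment that names memory by page frame number. */
struct model_pfn_vec {
	unsigned long pfn;
	unsigned int offset;	/* byte offset within the page frame */
	unsigned int len;	/* byte length of the segment */
};

/* Hypothetical analogue of kmap_pfn(): translate a pfn to an address. */
static void *model_kmap_pfn(unsigned long pfn)
{
	assert(pfn < MODEL_NR_PAGES);
	return model_pmem + (pfn << MODEL_PAGE_SHIFT);
}

/* Option 1b: a descriptor allocated only for the lifetime of one I/O. */
struct model_page {
	unsigned long pfn;
};

static struct model_page *model_page_get(unsigned long pfn)
{
	struct model_page *page = malloc(sizeof(*page));

	if (page)
		page->pfn = pfn;
	return page;
}

static void model_page_put(struct model_page *page)
{
	free(page);
}

/* Option 2 path: map the pfn directly and copy; no descriptor needed. */
static int model_write_pfn(const struct model_pfn_vec *vec, const void *src)
{
	/* Reject segments that would cross the end of the page frame. */
	if (vec->offset > MODEL_PAGE_SIZE ||
	    vec->len > MODEL_PAGE_SIZE - vec->offset)
		return -1;

	memcpy((unsigned char *)model_kmap_pfn(vec->pfn) + vec->offset,
	       src, vec->len);
	return 0;
}

/* Option 1b path: allocate a descriptor, do the same copy, free it. */
static int model_write_temp_page(const struct model_pfn_vec *vec,
				 const void *src)
{
	struct model_page *page = model_page_get(vec->pfn);
	struct model_pfn_vec tmp;
	int ret;

	if (!page)
		return -1;

	tmp.pfn = page->pfn;
	tmp.offset = vec->offset;
	tmp.len = vec->len;
	ret = model_write_pfn(&tmp, src);

	model_page_put(page);
	return ret;
}
```

In this model the 1b path is just the option 2 path wrapped in an
allocate/free pair, which is the sense in which a "dynamic allocation"
stop could fall out of the same work. Whether that holds for the real
block layer is exactly the open question in this thread.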