Message-ID: <CAPcyv4jTO7peOJ1zUocZiuqejoZmhtUYZbYcM==U+R-frB+sgA@mail.gmail.com>
Date: Mon, 23 Oct 2017 04:20:46 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Martin Schwidefsky <schwidefsky@...ibm.com>
Cc: Christoph Hellwig <hch@....de>,
Andrew Morton <akpm@...ux-foundation.org>,
Jan Kara <jack@...e.cz>,
"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Heiko Carstens <heiko.carstens@...ibm.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
linux-xfs@...r.kernel.org, Linux MM <linux-mm@...ck.org>,
Jeff Moyer <jmoyer@...hat.com>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Ross Zwisler <ross.zwisler@...ux.intel.com>,
Gerald Schaefer <gerald.schaefer@...ibm.com>
Subject: Re: [PATCH v3 02/13] dax: require 'struct page' for filesystem dax
On Mon, Oct 23, 2017 at 3:44 AM, Martin Schwidefsky
<schwidefsky@...ibm.com> wrote:
> On Mon, 23 Oct 2017 01:55:20 -0700
> Dan Williams <dan.j.williams@...el.com> wrote:
>
>> On Sun, Oct 22, 2017 at 10:18 PM, Martin Schwidefsky
>> <schwidefsky@...ibm.com> wrote:
>> > On Fri, 20 Oct 2017 18:29:33 +0200
>> > Christoph Hellwig <hch@....de> wrote:
>> >
>> >> On Fri, Oct 20, 2017 at 08:23:02AM -0700, Dan Williams wrote:
>> >> > Yes, however it seems these drivers / platforms have been living with
>> >> > the lack of struct page for a long time. So they either don't use DAX,
>> >> > or they have a constrained use case that never triggers
>> >> > get_user_pages(). If it is the latter then they could introduce a new
>> >> > configuration option that bypasses the pfn_t_devmap() check in
>> >> > bdev_dax_supported() and fix up the get_user_pages() paths to fail.
>> >> > So, I'd like to understand how these drivers have been using DAX
>> >> > support without struct page to see if we need a workaround or we can
>> >> > go ahead and delete this support. If the usage is limited to
>> >> > execute-in-place perhaps we can do a constrained ->direct_access() for
>> >> > just that case.
>> >>
>> >> For axonram I doubt anyone is using it any more - it was a very for
>> >> the IBM Cell blades, which were produced in a rather limited number.
>> >> And Cell basically seems to be dead as far as I can tell.
>> >>
>> >> For S/390 Martin might be able to clarify the status of xpram
>> >> in general and of DAX support in particular.
>> >
>> > This goes back to the time when DAX was called XIP. The initial design
>> > point has been *not* to have struct pages for a large read-only memory
>> > area. There is a block device driver for z/VM that maps a DCSS segment
>> > somewhere in memory (no struct page!) with e.g. the complete /usr
>> > filesystem. The xpram driver is a different beast and has nothing to
>> > do with XIP/DAX.
>> >
>> > Now, there are very few users of the dcssblk driver out there, if any.
>> > The idea to save a few megabytes for /usr never really took off.
>> >
>> > We have to look at our get_user_pages() implementation to see how hard
>> > it would be to make it fail if the target address is for an area without
>> > struct pages.
>>
>> For read-only memory I think we can enable a subset of DAX, and
>> explicitly turn off the paths that require get_user_pages(). However,
>> I wonder if anyone has tested DAX with dcssblk because fork() requires
>> get_user_pages()?
>
> I did not test it recently, someone else might have. Gerald?
>
> Looking at the code I see this in the s390 version of gup_pte_range:
>
>       mask = (write ? _PAGE_PROTECT : 0) | _PAGE_INVALID | _PAGE_SPECIAL;
>       ...
>       if ((pte_val(pte) & mask) != 0)
>               return 0;
>       ...
>
> The XIP code used the pte_mkspecial mechanics to make it work. As far as
> I can see the pfn_t_devmap returns true for the DAX mappings, yes?

Yes, but that's only for get_user_pages_fast() support.

> Then I would say that dcssblk and DAX currently do not work together.

I think at a minimum we need a new pfn_t flag for the 'special' bit to
indicate that DAX mappings of dcssblk and axonram do not support
normal get_user_pages(). Then I don't need to explicitly disable DAX
in the !pfn_t_devmap() case. I think I also want to split the
"pfn_to_virt()" and the "sector to pfn" operations into distinct
dax_operations rather than doing both in one ->direct_access(). This
supports storing pfns in the fs/dax radix tree rather than sectors.

In other words, the pfn_t_devmap() requirement was only about making
get_user_pages() safely fail, and pte_special() fulfills that
requirement.
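
To make that last point concrete, here is a minimal user-space model of
the idea, not kernel code. It assumes a hypothetical "special" pfn flag
that causes the pte to be marked special, and reuses the shape of the
s390 mask test quoted above. All MODEL_* names and bit values are made
up for illustration and do not match the real pfn_t or s390 pte
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative pte bits; values are arbitrary, not the s390 ones. */
#define MODEL_PAGE_PROTECT  (1UL << 0)  /* write-protected */
#define MODEL_PAGE_INVALID  (1UL << 1)  /* not present */
#define MODEL_PAGE_SPECIAL  (1UL << 2)  /* no struct page behind the pte */

/* Illustrative pfn flags; MODEL_PFN_SPECIAL is the hypothetical new flag. */
#define MODEL_PFN_DEVMAP    (1u << 0)   /* struct page exists via devmap */
#define MODEL_PFN_SPECIAL   (1u << 1)   /* driver has no struct page */

/* Build a pte for a DAX mapping: mark it special if the pfn says so. */
uint64_t model_mk_pte(uint64_t base_pte, unsigned int pfn_flags)
{
	if (pfn_flags & MODEL_PFN_SPECIAL)
		return base_pte | MODEL_PAGE_SPECIAL;
	return base_pte;
}

/*
 * Same structure as the quoted gup_pte_range() test: any masked bit set
 * means the walker must bail out, so get_user_pages() fails safely.
 */
bool model_gup_pte_ok(uint64_t pte, bool write)
{
	uint64_t mask = (write ? MODEL_PAGE_PROTECT : 0) |
			MODEL_PAGE_INVALID | MODEL_PAGE_SPECIAL;

	return (pte & mask) == 0;
}

/* Small self-check exercising the two kinds of mapping. */
void model_selfcheck(void)
{
	/* devmap-backed mapping: gup may proceed */
	assert(model_gup_pte_ok(model_mk_pte(0, MODEL_PFN_DEVMAP), false));
	/* special mapping (dcssblk/axonram style): gup is refused */
	assert(!model_gup_pte_ok(model_mk_pte(0, MODEL_PFN_SPECIAL), false));
}
```

In this model the special bit alone is enough to make the walk bail
out, for reads and writes alike, so no devmap check is needed to get
the "safely fail" behavior.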