Message-ID: <CAPcyv4iuSvx0kYLOaBXP57M-AgStubGg08Td7s_08G=cZTYWdw@mail.gmail.com>
Date: Tue, 23 Aug 2016 22:48:46 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: "Kani, Toshimitsu" <toshi.kani@....com>
Cc: "Mulumudi, Abhilash Kumar" <m.abhilash-kumar@....com>,
"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
"ard.biesheuvel@...aro.org" <ard.biesheuvel@...aro.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"brian.starkey@....com" <brian.starkey@....com>
Subject: Re: [PATCH] memremap: Fix NULL pointer BUG in get_zone_device_page()
On Tue, Aug 23, 2016 at 8:58 PM, Dan Williams <dan.j.williams@...el.com> wrote:
> On Tue, Aug 23, 2016 at 7:53 PM, Dan Williams <dan.j.williams@...el.com> wrote:
>> On Tue, Aug 23, 2016 at 6:29 PM, Kani, Toshimitsu <toshi.kani@....com> wrote:
>>>> On Tue, Aug 23, 2016 at 4:47 PM, Kani, Toshimitsu <toshi.kani@....com>
>>>> wrote:
>>>> > On Tue, 2016-08-23 at 15:32 -0700, Dan Williams wrote:
>>>> >> On Tue, Aug 23, 2016 at 11:43 AM, Toshi Kani <toshi.kani@....com>
>>>> >> wrote:
>>>> > :
>>>> >> I'm not sure about this fix. The point of honoring
>>>> >> vmem_altmap_offset() is because a portion of the resource that is
>>>> >> passed to devm_memremap_pages() also contains the metadata info block
>>>> >> for the device. The offset says "use everything past this point for
>>>> >> pages". This may work for avoiding a crash, but it may corrupt info
>>>> >> block metadata in the process. Can you provide more information
>>>> >> about the failing scenario to be sure that we are not triggering a
>>>> >> fault on an address that is not meant to have a page mapping? I.e.
>>>> >> what is the host physical address of the page that caused this fault,
>>>> >> and is it valid?
>>>> >
>>>> > The fault address in question was the 2nd page of an NVDIMM range. I
>>>> > assumed this fault address was valid and needed to be handled. Here is
>>>> > some info about the base and patched cases. Let me know if you need
>>>> > more info.
>>>> >
>>>> > Base
>>>> > ====
>>>> >
>>>> > The following NVDIMM range was set to /dev/dax.
>>>>
>>>> With ndctl create-namespace or manually via sysfs? Specifically I'm
>>>> looking for what the 'align' attribute was set to when the
>>>> configuration was established. Can you provide a dump of the sysfs
>>>> attributes for the /dev/dax parent device?
>>>
>>> I used the ndctl command below.
>>> ndctl create-namespace -f -e namespace0.0 -m dax
>>>
>>> Here is additional info from my note for the base case.
>>>
>>> p {struct dev_pagemap} 0xffff88046d0453f0
>>> $3 = {
>>> altmap = 0xffff88046d045410,
>>> res = 0xffff88046d0453a8,
>>> ref = 0xffff88046d0452f0,
>>> dev = 0xffff880464790410
>>> }
>>>
>>> crash> p {struct vmem_altmap} 0xffff88046d045410
>>> $6 = {
>>> base_pfn = 0x480000,
>>> reserve = 0x2, // PHYS_PFN(SZ_8K)
>>> free = 0x101fe,
>>> align = 0x1fe,
>>> alloc = 0x10000
>>> }
>>
>> Ah, so, on second look the 0x490200000 data offset looks correct. The
>> total size of the address range is 16GB which equates to 256MB needed
>> for struct page, plus 2MB more to re-align the data on the next 2MB
>> boundary.
>>
>> The question now is why is the guest faulting on an access to an
>> address less than 0x490200000?
>
> Does the attached patch fix this for you?
Sorry, it should be this much simpler patch, which also mirrors what
drivers/nvdimm/pmem.c is doing...
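
For reference, the 0x490200000 data offset discussed above can be
reproduced from the vmem_altmap dump. The sketch below is only an
illustration of that arithmetic, not kernel code: the struct and the
sketch_* helpers are hypothetical names, and it assumes 4 KiB pages and
a 64-byte struct page. It also assumes the number of pfns skipped ahead
of the data is reserve + free, which is consistent with the values in
the dump.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions, not taken from the thread: 4 KiB pages, 64-byte struct page. */
#define SKETCH_PAGE_SHIFT       12
#define SKETCH_STRUCT_PAGE_SIZE 64

/* Hypothetical stand-in holding the fields shown in the crash dump. */
struct altmap_sketch {
	uint64_t base_pfn;
	uint64_t reserve;
	uint64_t free;
	uint64_t align;
	uint64_t alloc;
};

/* Values copied from the quoted "p {struct vmem_altmap}" output. */
const struct altmap_sketch dump_altmap = {
	.base_pfn = 0x480000,
	.reserve  = 0x2,
	.free     = 0x101fe,
	.align    = 0x1fe,
	.alloc    = 0x10000,
};

/* Number of pfns past base_pfn that are set aside ahead of the data. */
uint64_t sketch_altmap_offset(const struct altmap_sketch *a)
{
	return a->reserve + a->free;
}

/* Physical address of the first data page. */
uint64_t sketch_data_start(const struct altmap_sketch *a)
{
	return (a->base_pfn + sketch_altmap_offset(a)) << SKETCH_PAGE_SHIFT;
}

/* Bytes of struct page needed to describe a range of range_bytes. */
uint64_t sketch_memmap_bytes(uint64_t range_bytes)
{
	/* The range is expected to be a whole number of pages. */
	assert((range_bytes & ((1ULL << SKETCH_PAGE_SHIFT) - 1)) == 0);
	return (range_bytes >> SKETCH_PAGE_SHIFT) * SKETCH_STRUCT_PAGE_SIZE;
}
```

With the dumped values, sketch_altmap_offset() gives 0x10200 pfns
(reserve 0x2 + alloc 0x10000 + align 0x1fe), so the data starts at
(0x480000 + 0x10200) << 12 = 0x490200000, and a 16GB range needs
256MB of struct page under the stated assumptions.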
View attachment "0001-dax-fix-device-dax-region-base.patch" of type "text/x-patch" (2245 bytes)