Message-ID: <5520FDCB.80505@plexistor.com>
Date: Sun, 05 Apr 2015 12:18:03 +0300
From: Boaz Harrosh <boaz@...xistor.com>
To: Yinghai Lu <yinghai@...nel.org>, Toshi Kani <toshi.kani@...com>
CC: Jens Axboe <axboe@...nel.dk>, linux-nvdimm@...1.01.org,
the arch/x86 maintainers <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel@...r.kernel.org, Christoph Hellwig <hch@....de>
Subject: Re: [Linux-nvdimm] [PATCH 1/2] x86: add support for the non-standard
protected e820 type
On 04/03/2015 08:12 PM, Yinghai Lu wrote:
> On Fri, Apr 3, 2015 at 9:14 AM, Toshi Kani <toshi.kani@...com> wrote:
>> On Wed, 2015-04-01 at 09:12 +0200, Christoph Hellwig wrote:
>> :
>>> @@ -748,7 +758,7 @@ u64 __init early_reserve_e820(u64 size, u64 align)
>>> /*
>>> * Find the highest page frame number we have available
>>> */
>>> -static unsigned long __init e820_end_pfn(unsigned long limit_pfn, unsigned type)
>>> +static unsigned long __init e820_end_pfn(unsigned long limit_pfn)
>>> {
>>> int i;
>>> unsigned long last_pfn = 0;
>>> @@ -759,7 +769,11 @@ static unsigned long __init e820_end_pfn(unsigned long limit_pfn, unsigned type)
>>> unsigned long start_pfn;
>>> unsigned long end_pfn;
>>>
>>> - if (ei->type != type)
>>> + /*
>>> + * Persistent memory is accounted as ram for purposes of
>>> + * establishing max_pfn and mem_map.
>>> + */
>>> + if (ei->type != E820_RAM && ei->type != E820_PRAM)
>>> continue;
>>
>> Should we also delete this code, accounting E820_PRAM as ram, along with
>> the deletion of reserve_pmem() in this version?
>
Hi Yinghai, Toshi
My old patches did not have these updates either, and everything
was very much usable for a long time.
However, I actually like these changes in Christoph's patches and
think they should stay. Here is why.
Today I will be sending patches that let pmem be supported with
page-struct, as an optional alternative to the use of ioremap.
This is for advanced users who want to do RDMA, direct_IO and so
on directly out of pmem.
At one point we had a BUG in some mm/memory.c code that was checking max_pfn.
Actually that was a bug and we do not go through this code anymore. And between
us, that global variable max_pfn is a bad hack. But as long as it is used, I
would rather have pmem covered by it, so that code which wants to protect
itself by checking max_pfn can still accept pmem memory submitted to it.
I have tried to audit the kernel's uses of max_pfn and I do not see how
this can hurt. I do see where it would theoretically help.
Think of a system whose memory map looks like this:
1. VM (Volatile mem)
2. PM
3. VM
4. PM
This is what current and planned NUMA implementations return.
Without this change max_pfn is the end of region 3, so pmem region 2 is
covered by max_pfn but pmem region 4 is not.
Any code that checks against max_pfn will be OK with pmem-2 but *not* with
pmem-4. This is highly unexpected.
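
To make the region-2 vs region-4 point concrete, here is a small user-space
sketch. It is not kernel code: the sketch_* names, the type values and the
region sizes are all made up for illustration. It only mimics the filter in
e820_end_pfn() from the quoted hunk, with a flag to choose whether persistent
memory counts toward the highest pfn.

```c
#include <stddef.h>

/* Illustrative stand-ins for the e820 types; NOT the kernel definitions. */
enum sketch_type { SKETCH_RAM, SKETCH_PRAM };

struct sketch_region {
	unsigned long start_pfn;
	unsigned long end_pfn;		/* exclusive */
	enum sketch_type type;
};

/* The VM / PM / VM / PM layout described above, with made-up sizes. */
static const struct sketch_region sketch_map[] = {
	{ 0x000, 0x100, SKETCH_RAM  },	/* 1. VM */
	{ 0x100, 0x200, SKETCH_PRAM },	/* 2. PM */
	{ 0x200, 0x300, SKETCH_RAM  },	/* 3. VM */
	{ 0x300, 0x400, SKETCH_PRAM },	/* 4. PM */
};

/*
 * Highest page frame number over the regions that are accounted.
 * count_pram == 0: only RAM is accounted.
 * count_pram != 0: persistent memory is accounted as well.
 */
static unsigned long sketch_end_pfn(const struct sketch_region *map,
				    size_t n, int count_pram)
{
	unsigned long last_pfn = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (map[i].type != SKETCH_RAM &&
		    !(count_pram && map[i].type == SKETCH_PRAM))
			continue;
		if (map[i].end_pfn > last_pfn)
			last_pfn = map[i].end_pfn;
	}
	return last_pfn;
}

/* The kind of guard some code applies: reject anything at or above max_pfn. */
static int sketch_pfn_ok(unsigned long pfn, unsigned long max_pfn)
{
	return pfn < max_pfn;
}
```

With count_pram == 0 the result is 0x300, so a pfn inside region 2 (say
0x180) passes sketch_pfn_ok() while one inside region 4 (say 0x380) does
not. With count_pram != 0 the result is 0x400 and both pass.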
I think max_pfn should be killed altogether ASAP, but until it is,
it will not hurt for pmem to be covered.
Thanks
Boaz