Message-ID: <CAPcyv4hBw1cDmi836Gx1xjXfhmswgL=P1exBMjTFGaRV_Osiig@mail.gmail.com>
Date: Wed, 29 Jun 2016 07:41:11 -0700
From: Dan Williams <dan.j.williams@...el.com>
To: Yigal Korman <yigal@...xistor.com>
Cc: "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, X86 ML <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-nvdimm@...ts.01.org" <linux-nvdimm@...1.01.org>
Subject: Re: [RESEND PATCH] x86/mm: only allow memmap=XX!YY over existing RAM
On Tue, Jun 28, 2016 at 10:12 PM, Yigal Korman <yigal@...xistor.com> wrote:
> On Wed, Jun 29, 2016 at 4:09 AM, Dan Williams <dan.j.williams@...el.com> wrote:
>>
>> On Tue, Jun 28, 2016 at 10:58 AM, H. Peter Anvin <hpa@...or.com> wrote:
>> > On 06/28/16 09:33, Dan Williams wrote:
>> >> On Tue, Jun 28, 2016 at 1:31 AM, Yigal Korman <yigal@...xistor.com> wrote:
>> >>> Before this patch, passing a range that is beyond the physical memory
>> >>> range would succeed: the user would see a /dev/pmem0 and be able to
>> >>> access it. Reads would always return 0 and writes would be silently
>> >>> ignored.
>> >>>
>> >>> I've gotten more than one bug report about mkfs.{xfs,ext4} or nvml
>> >>> failures that were eventually tracked down to wrong values passed to
>> >>> memmap.
>> >>>
>> >>> This patch prevents the above issue: instead of adding a new memory
>> >>> range, it only updates an existing RAM memory range with the PRAM type.
>> >>> This way, passing the wrong memmap will either not give you a pmem at
>> >>> all, or give you a smaller one that actually has RAM behind it.
>> >>>
>> >>> And if someone still needs to fake a pmem that doesn't have RAM behind
>> >>> it, they can simply do memmap=XX@YY,XX!YY.
>> >>>
>> >>> Signed-off-by: Yigal Korman <yigal@...xistor.com>
>> >>> Acked-by: Dan Williams <dan.j.williams@...el.com>
>> >>> Acked-by: Johannes Thumshirn <jthumshirn@...e.de>
>> >>> Tested-by: Boaz Harrosh <boaz@...xistor.com>
>> >>> ---
>> >>
>> >> I have some other libnvdimm fixes heading upstream shortly if x86
>> >> folks just want to ack this one...
>> >>
>> >
>> > I'm concerned about this. It would mean that memory deliberately not
>> > marked as RAM (because it is persistent and you don't want the OS to
>> > accidentally clobber it) can no longer be added.
>>
>> Ah, true. Specifically, you are worried about the case where platform
>> firmware has mis-marked pmem as reserved memory (or some other type)
>> and the user would like to correct it to be pram.
>
>
> As I mentioned in the patch, this is still possible by doing memmap=X@Y,X!Y
Sorry, yes I overlooked this in my response to Peter.
> Also, with fixes in grub and the kernel regarding mis-marking NVDIMMs,
> this is much less common today.
> My purpose was simply to prevent a recurring user error in the common
> use case.
That makes sense, but at the same time it still carries a user education
burden: memmap=X!Y is now constrained to RAM by default.
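(To make that concrete, with made-up numbers: on a box with, say, 16G of
RAM, something like

    memmap=4G!0x10000000000

would now give no /dev/pmem0 at all, since there is no RAM at that
address, whereas

    memmap=4G@0x10000000000,4G!0x10000000000

first claims the range as RAM and then re-marks it as pram, so the fake
device still shows up, per the workaround in the changelog.)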
>>
>>
>> > So it seems like The Wrong
>> > Thing. If all you want is simulated pram then it shouldn't care about
>> > addresses in the first place and instead we should just specify it by
>> > quantity.
>>
>> Yes, I agree we need an explicit "simulate pram" option independent of
>> memmap=, or we just continue to educate users that if they try to
>> simulate pmem and specify an invalid range they get to keep all the
>> broken pieces.
>
> I'd love to have a simpler way to specify simulated pram, but quantity
> alone is not good enough.
> For my use case, for example, I need the quantity to be spread evenly
> over all NUMA nodes, so just getting a range "somewhere" is not good
> enough.
> And I can imagine other users who want to pin pram to the same socket
> where their high-speed NIC sits.
> So I'm not sure we can find a better general API than memmap, and I'm
> not sure it's worth it.
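Right, and for that kind of layout you end up hand-picking one range per
node anyway, something like (hypothetical two-socket box, node 0 RAM
below 64G, node 1 RAM above it):

    memmap=2G!0x800000000,2G!0x1800000000

...with each start address chosen to land inside the corresponding
node's RAM.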
I think it would be worthwhile to have something like testpmem= which
takes the same parameters as memmap=, but it is constrained to RAM by
default. Then it becomes clear that the user really does want a
simulation configuration on RAM rather than explicitly specifying a
new persistent memory range.
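Rough sketch of how that could look (testpmem= is just a straw-man name,
nothing is implemented): it would accept the same syntax, e.g.

    testpmem=2G!0x800000000,2G!0x1800000000

...but, by default, only apply it to ranges backed by existing RAM,
rather than quietly creating a pmem device with nothing behind it.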