Message-ID: <4FD7F329.1000203@jp.fujitsu.com>
Date: Wed, 13 Jun 2012 10:55:53 +0900
From: Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: Bjorn Helgaas <bhelgaas@...gle.com>
CC: Wen Congyang <wency@...fujitsu.com>, rob@...dley.net,
tglx@...utronix.de, Ingo Molnar <mingo@...hat.com>, x86@...nel.org,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2 v2] x86: add max_addr boot option
(2012/06/12 20:30), Bjorn Helgaas wrote:
> On Mon, Jun 11, 2012 at 11:29 PM, Wen Congyang<wency@...fujitsu.com> wrote:
>> At 06/12/2012 01:35 AM, Bjorn Helgaas Wrote:
>>> On Mon, Jun 11, 2012 at 1:44 AM, Wen Congyang<wency@...fujitsu.com> wrote:
>>>> Currently, the boot option max_addr is only supported on ia64 platform.
>>>> We also need it on x86 platform.
>>>> For example:
>>>> There are two nodes:
>>>> NODE#0 address range 0x00000000 00000000 - 0x00010000 00000000
>>>> NODE#1 address range 0x00010000 00000000 - 0x00020000 00000000
>>>> If we only want to use node0, we can specify the max_addr. The boot
>>>> option "mem=" can do the same thing now. But the boot option "mem="
>>>> means the total memory used by the system. If we tell the user
>>>> that the boot option "mem=" can do this, it will confuse the user.
>>>> So we need a new boot option "max_addr" on x86 platform.
>>>
>>> I don't object to this patch (and thanks for tweaking the mem range printk).
>>>
>>> I don't know what your use case is, but from a user interface
>>> perspective, the "max_addr=" option feels like a bit of a hack. If
>>> you're trying to avoid use of other nodes, "max_addr" is an awkward
>>> way to do it. It requires the user to know the physical address ->
>>> node mappings, and it doesn't affect the CPUs and I/O resources on
>>> other nodes. You could implement a "numa_node=" or similar parameter
>>> that would allow you to ignore remote memory, CPUs, and I/O.
>>
>> Currently, I only need to ignore the memory. If we need to ignore a node,
>> "numa_node=" or similar parameter is a better choice.
>
> Doesn't the end user have to know the memory map of the system to use
> "max_addr="? How do you know what value to supply? Do you have to
> attempt a boot once to discover the highest address on node 0? What
> if node 0 and node 1 memory are interleaved, so there's some node 1
> memory below the highest node 0 address?
>
Our current plan is to avoid asking end users to fix their boot options by hand,
even if the memory size per node changes. We'll ship hardware that has a
_fixed_ physical address range for each node, regardless of the installed memory size.
The addresses will be documented in the hardware manual, or we'll ship a tool with the hardware.
Of course, we disable interleaving between nodes.
IIUC, the memory layout can change because the hardware error detection logic can
turn off a DIMM before boot. So, if we use memmap=, which requires precise knowledge
of the memory map, the system admin needs to modify it whenever that happens:
Problem happens => reboot (some DIMM disabled) => remove the memmap= option to avoid
trouble => check the memory layout again => fix memmap= => reboot again.
These reboots take a long time because systems that support dynamic partitioning tend to
be big. So we'd like to have a _relaxed_ way to specify the region of memory:
Problem happens => reboot (some DIMM disabled) => no changes required
(because we have a large enough memory hole between Node0 and Node1.)
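To illustrate why a fixed per-node address window makes the option stable, here is a
minimal sketch, not the patch itself. It assumes the node boundary from Wen's example
(NODE#1 starting at 0x00010000 00000000); the DIMM sizes and the helper name
clamp_to_max_addr are made up for illustration.

```python
GiB = 1 << 30

# Node boundary taken from the example in the patch description.
NODE1_BASE = 0x0001_0000_0000_0000


def clamp_to_max_addr(ranges, max_addr):
    """Drop or trim (start, end) RAM ranges so nothing lies at or above max_addr."""
    kept = []
    for start, end in ranges:
        if start >= max_addr:
            continue  # range is entirely above the limit
        kept.append((start, min(end, max_addr)))
    return kept


# Hypothetical layouts: node 0 before and after one DIMM is disabled at boot.
before = [(0, 64 * GiB), (NODE1_BASE, NODE1_BASE + 64 * GiB)]
after = [(0, 48 * GiB), (NODE1_BASE, NODE1_BASE + 64 * GiB)]

# The same max_addr value selects only node 0 memory in both cases.
print(clamp_to_max_addr(before, NODE1_BASE) == [(0, 64 * GiB)])
print(clamp_to_max_addr(after, NODE1_BASE) == [(0, 48 * GiB)])
```

Because the limit sits in the hole between the two node windows, it does not need to
change when the amount of memory inside node 0 changes.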
BTW, what do you think about the mem= boot option, which currently works as max_addr=?
It has caused trouble at our support desk several times, for example:
Q. I specified the mem=8G boot option, but the system seems to have only 7GB....
A. That's because of the PCI configuration area in the 3G-4G address range...
Even if our requirement can be covered by the current mem= option, I'd like to have
a max_addr= option and make the mem= option sane, as on ia64.
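The 8G vs. 7GB confusion above can be reproduced with a little arithmetic. This is
only a sketch under assumptions: the memory map is hypothetical (8 GiB of RAM with the
3G-4G window left as a hole, so the rest sits at 4G-9G), and the two helper functions
are illustrative names, not kernel code. One treats the value as an address limit,
the other as a total amount of memory.

```python
GiB = 1 << 30


def limit_by_address(ram, max_addr):
    """Keep only RAM below physical address max_addr (address-limit semantics)."""
    return [(s, min(e, max_addr)) for s, e in ram if s < max_addr]


def limit_by_total(ram, total):
    """Keep RAM from the lowest addresses up until `total` bytes are collected."""
    kept = []
    for s, e in ram:
        if total <= 0:
            break
        size = min(e - s, total)
        kept.append((s, s + size))
        total -= size
    return kept


def size_gib(ram):
    """Total size of a list of (start, end) ranges, in GiB."""
    return sum(e - s for s, e in ram) // GiB


# Hypothetical map: 3 GiB below the PCI window, 5 GiB above it.
ram = [(0, 3 * GiB), (4 * GiB, 9 * GiB)]

by_addr = limit_by_address(ram, 8 * GiB)
by_total = limit_by_total(ram, 8 * GiB)
print(size_gib(by_addr))   # 7: the 1 GiB hole counts against the limit
print(size_gib(by_total))  # 8: the hole is skipped
```

With address-limit semantics the user who asks for 8G gets 7 GiB, which is the
support-desk question above. With total-amount semantics the same value yields 8 GiB.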
Thanks,
-Kame