Message-ID: <2F8F36EB-D2D8-4DCE-910C-C56FFCEED3BF@numascale.com>
Date: Fri, 28 Aug 2015 06:42:07 +0000
From: Steffen Persvold <sp@...ascale.com>
To: Yinghai Lu <yinghai@...nel.org>
CC: x86 <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>
Subject: Re: CONFIG_HOLES_IN_ZONE and memory hot plug code on x86_64
On 27/08/15 22:20, Yinghai Lu <yinghai@...nel.org> wrote:
>On Fri, Jun 26, 2015 at 4:31 PM, Steffen Persvold <sp@...ascale.com> wrote:
>> We’ve encountered an issue in a special case where we have a sparse E820 map [1].
>>
>> Basically the memory hotplug code is causing a “kernel paging request” BUG [2].
>
>The trace does not look like the hotplug path.
>
>>
>> By instrumenting the function register_mem_sect_under_node() in drivers/base/node.c we see that it is called twice with the same struct memory_block argument:
>>
>> [ 1.901463] register_mem_sect_under_node: start = 80, end = 8f, nid = 0
>> [ 1.908129] register_mem_sect_under_node: start = 80, end = 8f, nid = 1
>
>Can you post the whole log with SRAT-related info?
I can probably reproduce this and get full logs the next time I have run time on the system, but here is some output that we saved in our internal Jira case:
[ 0.000000] NUMA: Initialized distance table, cnt=6
[ 0.000000] NUMA: Node 0 [mem 0x00000000-0x0009ffff] + [mem 0x00100000-0xd7ffffff] -> [mem 0x00000000-0xd7ffffff]
[ 0.000000] NUMA: Node 0 [mem 0x00000000-0xd7ffffff] + [mem 0x100000000-0x427ffffff] -> [mem 0x00000000-0x427ffffff]
[ 0.000000] NODE_DATA(0) allocated [mem 0x407fe3000-0x407ffffff]
[ 0.000000] NODE_DATA(1) allocated [mem 0x807fe3000-0x807ffffff]
[ 0.000000] NODE_DATA(2) allocated [mem 0xc07fe3000-0xc07ffffff]
[ 0.000000] NODE_DATA(3) allocated [mem 0x1007fe3000-0x1007ffffff]
[ 0.000000] NODE_DATA(4) allocated [mem 0x1407fe3000-0x1407ffffff]
[ 0.000000] NODE_DATA(5) allocated [mem 0x1807fdd000-0x1807ff9fff]
[ 0.000000] [ffffea0000000000-ffffea00101fffff] PMD -> [ffff8803f8600000-ffff880407dfffff] on node 0
[ 0.000000] [ffffea0010a00000-ffffea00201fffff] PMD -> [ffff8807f8600000-ffff880807dfffff] on node 1
[ 0.000000] [ffffea0020a00000-ffffea00301fffff] PMD -> [ffff880bf8600000-ffff880c07dfffff] on node 2
[ 0.000000] [ffffea0030a00000-ffffea00401fffff] PMD -> [ffff880ff8600000-ffff881007dfffff] on node 3
[ 0.000000] [ffffea0040a00000-ffffea00501fffff] PMD -> [ffff8813f8600000-ffff881407dfffff] on node 4
[ 0.000000] [ffffea0050a00000-ffffea00601fffff] PMD -> [ffff8817f7e00000-ffff8818075fffff] on node 5
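For reference, the instrumentation was nothing fancier than a pr_info() at the top of register_mem_sect_under_node(); reconstructed from memory (so not the exact debug patch), it was something like:

    pr_info("register_mem_sect_under_node: start = %lx, end = %lx, nid = %d\n",
            mem_blk->start_section_nr, mem_blk->end_section_nr, nid);

start_section_nr/end_section_nr come straight from the struct memory_block argument, which is why the two calls above print identical section ranges with different nids.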
If I remember correctly there was a mix of 4 GB and 8 GB DIMMs populated on this system. In addition, the firmware reserved 512 MB at the end of each memory controller's physical range (hence the reserved ranges in the e820 map).
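To make the double registration more concrete, here is a back-of-the-envelope check (my arithmetic, assuming the default x86_64 section size of 128 MiB, i.e. SECTION_SIZE_BITS = 27):

    #include <stdio.h>

    int main(void)
    {
            /* 128 MiB sections: SECTION_SIZE_BITS = 27 on x86_64 */
            unsigned long section_size = 1UL << 27;
            unsigned long start = 0x80 * section_size;         /* first section of the block */
            unsigned long end = (0x8f + 1) * section_size - 1; /* last byte of the block */

            /* prints: block spans [mem 0x400000000-0x47fffffff] */
            printf("block spans [mem %#lx-%#lx]\n", start, end);
            return 0;
    }

Node 0 ends at 0x427ffffff (see the NUMA lines above), so the memory block covering sections 0x80-0x8f straddles the reserved hole and the node 0/node 1 boundary, which would explain register_mem_sect_under_node() being called once per nid for the same block.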
Note: this was with vanilla 4.1.0, so it could be obsolete now with 4.2-rc. I have not yet tested with the latest patches that you and Tony discussed.
Cheers,
Steffen