Date:   Thu, 10 Aug 2017 09:54:25 +0800
From:   Dou Liyang <douly.fnst@...fujitsu.com>
To:     YASUAKI ISHIMATSU <yasu.isimatu@...il.com>,
        Baoquan He <bhe@...hat.com>
CC:     Chao Fan <fanc.fnst@...fujitsu.com>,
        <linux-kernel@...r.kernel.org>, <x86@...nel.org>,
        <tglx@...utronix.de>, <mingo@...hat.com>, <hpa@...or.com>,
        <keescook@...omium.org>, <dyoung@...hat.com>, <arnd@...db.de>,
        <dave.jiang@...el.com>, <indou.takao@...fujitsu.com>,
        <izumi.taku@...fujitsu.com>
Subject: Re: [PATCH] x86/boot/KASLR: Extend movable_node option for KASLR

Hi, YASUAKI

At 08/10/2017 12:55 AM, YASUAKI ISHIMATSU wrote:
>
>
> On 08/09/2017 10:44 AM, Dou Liyang wrote:
>>
>> Hi YASUAKI,
>>
>> [...]
>>>>>>>>
>>>>>>>> we boot up the kernel with 4 nodes:
>>>>>>>>
>>>>>>>> node 0 size: 1024 MB  immovable
>>>>>>>> node 1 size: 1024 MB  movable
>>>>>>>> node 2 size: 1024 MB  movable
>>>>>>>> node 3 size: 1024 MB  movable
>>>>>>>>
>>>>>>>> If we use "mem=1024M" in the command line, we can use only 1G of
>>>>>>>> memory. But actually, we should normally have 4G.
>>>>>>>
>>>>>>> So do you have an assumption about the order of immovable nodes and
>>>>>>> movable nodes? E.g. in your example above, the immovable node has to
>>>>>>> be at the lowest address. Is this required by the current hot-plug
>>>>>>> memory code?
>>>>>>>
>>>>>>
>>>>>> Wow, great! It seems this is required by the hot-plug memory code.
>>>>>>
>>>>>> Yesterday, I tested the patch in QEMU with 4 nodes, and each time I
>>>>>> used a different node as the immovable node. But no matter which node
>>>>>> I used, the immovable node always had the lowest address.
>>>>>>
>>>>>> I am not familiar with memory. I am investigating this and I am going
>>>>>> to apply for a physical machine with movable nodes to check. :)
>>>>>
>>>>
>>>> Cc YASUAKI ISHIMATSU
>>>>
>>>> Could you give us some help?
>>>>
>>>>> Great, thanks for your effort. I asked because this question confuses me
>>>>> and I know FJ has focused on the memory hot-plug implementation and
>>>>> continues working on it, so it must be easier for you to consult your
>>>>> co-workers who have worked on this. For a normal kernel, it seems the
>>>>> normal zone has to be on an immovable node, namely node0. But what if
>>>>> people modified the bootloader to locate the kernel onto the last node
>>>>> and configured the EFI firmware to make the last node un-hot-pluggable?
>>>>> I believe both of these can be done. Is this allowed? Does memory
>>>>> hot-plug have a requirement about the order of immovable nodes? And how
>>>>> many immovable nodes can we have? I have slides FJ published, but
>>>>> didn't find info about these.
>>>
>>> I read your patch. And I think what Baoquan wrote is right. The patch
>>> takes care of only your server. As he wrote, if a server places its
>>> immovable node on the last node, the patch cannot handle such a
>>> configuration.
>>>
>>
>> Thanks for your review. It is reasonable. I will keep it in mind.
>>
>> But I am not sure: when we boot up a system with the following 4
>> nodes, does the BIOS (ACPI firmware) map the immovable node RAM from the
>> lowest address first?
>>
>> node 0 size: 1024 MB  immovable
>> node 1 size: 1024 MB  movable
>> node 2 size: 1024 MB  movable
>> node 3 size: 1024 MB  immovable
>>
>> The order of the physical RAM map may be node 0, 3, 1, 2.
>
>
> It depends on the SRAT table. If the system boots up with movable_node, the
> kernel checks the hot pluggable bit of each memory affinity structure in the
> SRAT table. If the hot pluggable bit is set, the memory will be movable. If
> it is not set, the memory will be immovable.
>
> If memory affinity structures in SRAT table are defined as follows, the system
> sets up the configuration you mentioned.
>
> PXM: start       : end         : hot pluggable bit
>   0:0x00000000000:0x0ffffffffff: disable
>   1:0x10000000000:0x1ffffffffff: enable
>   2:0x20000000000:0x2ffffffffff: enable
>   3:0x30000000000:0x3ffffffffff: disable
>
> We are not sure there is such a server. But there is no specification that
> the immovable node has to start at the lowest address. So the kernel should
> take care of such an SRAT table.
>

Yes, this patch didn't consider this situation.

It's related to the ACPI table. As far as I know, when the ACPI firmware
generates the local APIC entries in MADT, it generates enabled CPUs
first and then disabled ones (which will be hot-plugged). I don't know
whether this strategy is also used in SRAT or not.

I will validate the generation order of memory affinity structures in
ACPI SRAT. Then modify this patch.
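
The case discussed above can be sketched in C. This is only an
illustration, not the patch: struct mem_affinity, example_srat,
is_immovable() and single_boundary_suffices() are hypothetical names, the
struct is a simplified stand-in for the real SRAT memory affinity
structure, and the example ranges are assumed to be contiguous 1 TiB
blocks. Only the flag bits (bit 0 enabled, bit 1 hot pluggable) follow the
ACPI specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits of an ACPI SRAT Memory Affinity Structure. */
#define MEM_ENABLED        (1u << 0)
#define MEM_HOT_PLUGGABLE  (1u << 1)

/* Hypothetical, simplified view of one memory affinity entry. */
struct mem_affinity {
	uint32_t pxm;    /* proximity domain (node) */
	uint64_t start;  /* first byte of the range */
	uint64_t end;    /* last byte of the range, inclusive */
	uint32_t flags;
};

/* Assumed layout: nodes 0 and 3 immovable, contiguous 1 TiB ranges. */
const struct mem_affinity example_srat[4] = {
	{ 0, 0x00000000000ULL, 0x0ffffffffffULL, MEM_ENABLED },
	{ 1, 0x10000000000ULL, 0x1ffffffffffULL, MEM_ENABLED | MEM_HOT_PLUGGABLE },
	{ 2, 0x20000000000ULL, 0x2ffffffffffULL, MEM_ENABLED | MEM_HOT_PLUGGABLE },
	{ 3, 0x30000000000ULL, 0x3ffffffffffULL, MEM_ENABLED },
};

/* An enabled range without the hot pluggable bit is immovable. */
int is_immovable(const struct mem_affinity *m)
{
	return (m->flags & MEM_ENABLED) && !(m->flags & MEM_HOT_PLUGGABLE);
}

/*
 * Return 1 if a single boundary address can separate immovable memory
 * (below) from hot-pluggable memory (above), 0 otherwise.
 */
int single_boundary_suffices(const struct mem_affinity *tbl, size_t n)
{
	uint64_t lowest_movable = UINT64_MAX;
	uint64_t highest_immovable = 0;
	int have_immovable = 0;
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!(tbl[i].flags & MEM_ENABLED))
			continue;	/* disabled entries describe no memory */
		if (tbl[i].flags & MEM_HOT_PLUGGABLE) {
			if (tbl[i].start < lowest_movable)
				lowest_movable = tbl[i].start;
		} else {
			have_immovable = 1;
			if (tbl[i].end > highest_immovable)
				highest_immovable = tbl[i].end;
		}
	}
	return !have_immovable || highest_immovable < lowest_movable;
}
```

With the four-entry layout, single_boundary_suffices() returns 0 because
node 3 is immovable but sits above the movable nodes. With only the first
three entries it returns 1, which is the only case a single
movable_node=nn[KMG] boundary can describe.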


Thanks,
	dou.

> Thanks,
> Yasuaki Ishimatsu
>
>>
>>
>> Thanks,
>>
>>     dou,
>>
>>> Thanks,
>>> Yasuaki Ishimatsu
>>>
>>>>>
>>>>
>>>> Thanks,
>>>>     dou.
>>>>
>>>>>>
>>>>>>>>
>>>>>>>> The above is also one reason for not using 'mem=' directly. The
>>>>>>>> other reasons are:
>>>>>>>>
>>>>>>>> 1). Each kernel option has its own role; we'd better not misuse them.
>>>>>>>> 2). movable_node is used as a boot-time switch to make nodes movable
>>>>>>>> or not; it should consider all situations, such as KASLR.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>     dou.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 03, 2017 at 08:17:21PM +0800, Dou Liyang wrote:
>>>>>>>>>>> movable_node is a boot-time switch to make hot-pluggable memory
>>>>>>>>>>> NUMA nodes movable. This option is based on an assumption
>>>>>>>>>>> that any node which the kernel resides in is defined as
>>>>>>>>>>> un-hotpluggable. Linux can allocate memory near the kernel image
>>>>>>>>>>> to do its best to keep the kernel away from hotpluggable memory
>>>>>>>>>>> in the same NUMA node. So other nodes can be movable.
>>>>>>>>>>>
>>>>>>>>>>> But KASLR doesn't know which node is un-hotpluggable: all
>>>>>>>>>>> hotpluggable memory ranges are recorded in the ACPI SRAT table,
>>>>>>>>>>> and SRAT has not been parsed yet. So KASLR may randomize the
>>>>>>>>>>> kernel into a movable node, which will then be immovable.
>>>>>>>>>>>
>>>>>>>>>>> Extend the movable_node option to restrict the kernel to be
>>>>>>>>>>> randomized in immovable nodes by adding a parameter. This
>>>>>>>>>>> parameter sets up the boundaries between the movable nodes and
>>>>>>>>>>> immovable nodes.
>>>>>>>
>>>>>>> And here you mentioned boundaries, meaning more than one boundary, so
>>>>>>> how do you handle the case where movable nodes and immovable nodes
>>>>>>> are placed alternately?
>>>>>>>
>>>>>>> I mean, are you sure the current hot-plug memory code requires that
>>>>>>> the immovable node be the first node and that there is only one
>>>>>>> immovable node, or that there are several immovable nodes but they
>>>>>>> are the first few nodes?
>>>>>>>
>>>>>>> If yes, then this patch looks good to me, I would like to ack it.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Baoquan
>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Reported-by: Chao Fan <fanc.fnst@...fujitsu.com>
>>>>>>>>>>> Signed-off-by: Dou Liyang <douly.fnst@...fujitsu.com>
>>>>>>>>>>> ---
>>>>>>>>>>> Documentation/admin-guide/kernel-parameters.txt | 11 +++++++++--
>>>>>>>>>>> arch/x86/boot/compressed/kaslr.c                | 19 ++++++++++++++++---
>>>>>>>>>>> 2 files changed, 25 insertions(+), 5 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>>>> index d9c171c..44c7e33 100644
>>>>>>>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>>>>>>>>> @@ -2305,7 +2305,8 @@
>>>>>>>>>>>     mousedev.yres=    [MOUSE] Vertical screen resolution, used for devices
>>>>>>>>>>>             reporting absolute coordinates, such as tablets
>>>>>>>>>>>
>>>>>>>>>>> -    movablecore=nn[KMG]    [KNL,X86,IA-64,PPC] This parameter
>>>>>>>>>>> +    movablecore=nn[KMG]
>>>>>>>>>>> +            [KNL,X86,IA-64,PPC] This parameter
>>>>>>>>>>>             is similar to kernelcore except it specifies the
>>>>>>>>>>>             amount of memory used for migratable allocations.
>>>>>>>>>>>             If both kernelcore and movablecore is specified,
>>>>>>>>>>> @@ -2315,12 +2316,18 @@
>>>>>>>>>>>             that the amount of memory usable for all allocations
>>>>>>>>>>>             is not too small.
>>>>>>>>>>>
>>>>>>>>>>> -    movable_node    [KNL] Boot-time switch to make hotplugable memory
>>>>>>>>>>> +    movable_node    [KNL] Boot-time switch to make hot-pluggable memory
>>>>>>>>>>>             NUMA nodes to be movable. This means that the memory
>>>>>>>>>>>             of such nodes will be usable only for movable
>>>>>>>>>>>             allocations which rules out almost all kernel
>>>>>>>>>>>             allocations. Use with caution!
>>>>>>>>>>>
>>>>>>>>>>> +    movable_node=nn[KMG]
>>>>>>>>>>> +            [KNL] Extend movable_node to work well with KASLR. This
>>>>>>>>>>> +            parameter is the boundaries between the movable nodes
>>>>>>>>>>> +            and immovable nodes, the memory which exceeds it will
>>>>>>>>>>> +            be regarded as hot-pluggable.
>>>>>>>>>>> +
>>>>>>>>>>>     MTD_Partition=    [MTD]
>>>>>>>>>>>             Format: <name>,<region-number>,<size>,<offset>
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>>>>>>>>>>> index 91f27ab..7e2351b 100644
>>>>>>>>>>> --- a/arch/x86/boot/compressed/kaslr.c
>>>>>>>>>>> +++ b/arch/x86/boot/compressed/kaslr.c
>>>>>>>>>>> @@ -89,7 +89,10 @@ struct mem_vector {
>>>>>>>>>>> static bool memmap_too_large;
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -/* Store memory limit specified by "mem=nn[KMG]" or "memmap=nn[KMG]" */
>>>>>>>>>>> +/*
>>>>>>>>>>> + * Store memory limit specified by the following situations:
>>>>>>>>>>> + * "mem=nn[KMG]" or "memmap=nn[KMG]" or "movable_node=nn[KMG]"
>>>>>>>>>>> + */
>>>>>>>>>>> unsigned long long mem_limit = ULLONG_MAX;
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> @@ -212,7 +215,8 @@ static int handle_mem_memmap(void)
>>>>>>>>>>>     char *param, *val;
>>>>>>>>>>>     u64 mem_size;
>>>>>>>>>>>
>>>>>>>>>>> -    if (!strstr(args, "memmap=") && !strstr(args, "mem="))
>>>>>>>>>>> +    if (!strstr(args, "memmap=") && !strstr(args, "mem=") &&
>>>>>>>>>>> +        !strstr(args, "movable_node="))
>>>>>>>>>>>         return 0;
>>>>>>>>>>>
>>>>>>>>>>>     tmp_cmdline = malloc(len + 1);
>>>>>>>>>>> @@ -247,7 +251,16 @@ static int handle_mem_memmap(void)
>>>>>>>>>>>                 free(tmp_cmdline);
>>>>>>>>>>>                 return -EINVAL;
>>>>>>>>>>>             }
>>>>>>>>>>> -            mem_limit = mem_size;
>>>>>>>>>>> +            mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>>>>>>>>>>> +        } else if (!strcmp(param, "movable_node")) {
>>>>>>>>>>> +            char *p = val;
>>>>>>>>>>> +
>>>>>>>>>>> +            mem_size = memparse(p, &p);
>>>>>>>>>>> +            if (mem_size == 0) {
>>>>>>>>>>> +                free(tmp_cmdline);
>>>>>>>>>>> +                return -EINVAL;
>>>>>>>>>>> +            }
>>>>>>>>>>> +            mem_limit = mem_limit > mem_size ? mem_size : mem_limit;
>>>>>>>>>>>         }
>>>>>>>>>>>     }
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> 2.5.5
>>>>>>>>>>>

