Message-ID: <cb23761b-6800-f387-b302-c02f0cced2d0@cn.fujitsu.com>
Date:   Mon, 6 Mar 2017 10:11:36 +0800
From:   Dou Liyang <douly.fnst@...fujitsu.com>
To:     <mingo@...nel.org>, <tglx@...utronix.de>, <hpa@...or.com>,
        <rjw@...ysocki.net>, <lenb@...nel.org>, <xiaolong.ye@...el.com>,
        <guzheng1@...wei.com>, <izumi.taku@...fujitsu.com>
CC:     <x86@...nel.org>, <linux-acpi@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 0/5] Do repair works for the mapping of cpuid <->
 nodeid



At 03/03/2017 04:32 PM, Dou Liyang wrote:
> Hi All,
>
> My Simple Test Result:
>
> In our box: Fujitsu PQ2000 with 1 nodes for hot-plug.

s/1 nodes/2 nodes in 1 SB which contains CPU, Memory.../

Thanks,
	Liyang


>
> Before the patchset:
>
> +-------------------------------------+
> |                                     |
> |  NUMA node0 CPU:    0-23,256-279    +------+
> |  NUMA node1 CPU:    24-47,280-303   |      |
> |                                     |      |
> +-------------------------------------+      |
>                                          Hot-plug
> +-------------------------------------+      +
> |                                     |      |
> |  NUMA Node0: 0-23, 256-279          <------+
> |  NUMA Node1: 24-47, 280-303         |
> |  NUMA Node2: 64-69, 72-77, 80-85, 88-93...
> |  NUMA Node3: 96-101, 104-109, 112-117,...
> |                                     |      |
> +-------------------------------------+      |
>                                          Hot-remove
> +-------------------------------------+      |
> |                                     |      |
> |  NUMA node0 CPU:    0-23,256-279    |      |
> |  NUMA node1 CPU:    24-47,280-303   +^-----+
> |                                     |
> |                                     |
> +-------------------------------------+
>
> After the patchset:
>
> +-------------------------------------+
> |                                     |
> |  NUMA node0 CPU:    0-23,48-71      +------+
> |  NUMA node1 CPU:    24-47,72-95     |      |
> |                                     |      |
> +-------------------------------------+      |
>                                          Hot-plug
> +-------------------------------------+      +
> |                                     |      |
> |  NUMA node0 CPU:    0-23,48-71      <------+
> |  NUMA node1 CPU:    24-47,72-95     |
> |  NUMA node2 CPU:    96-143          +------+
> |  NUMA node3 CPU:    144-191         |      |
> |                                     |      |
> +-------------------------------------+      |
>                                          Hot-remove
> +-------------------------------------+      |
> |                                     |      |
> |  NUMA node0 CPU:    0-23,48-71      |      |
> |  NUMA node1 CPU:    24-47,72-95     +^-----+
> |                                     |
> |                                     |
> +-------------------------------------+
>
> I also tested some cases in VMs with QEMU.
>
> When I get more nodes, I will test the whole
> function.
>
> Thanks,
> Liyang.
>
> At 03/03/2017 04:02 PM, Dou Liyang wrote:
>> [Summary]:
>>
>> 1. Revert two commits.
>> 2. Fix the order of logical CPU IDs.
>> 3. Move the validation of processor IDs to hot-plug time.
>>
>> The mapping of "cpuid <-> nodeid" is established at boot time via ACPI
>> tables to keep associations of workqueues and other node-related items
>> consistent across CPU hotplug, as follows:
>>
>> Step 1. Make the "Logical CPU ID <-> Processor ID/UID" mapping fixed
>> using MADT:
>> We generate the logical CPU IDs in the order of the Local APIC/x2APIC IDs
>> and get the mapping of Processor ID/UID <-> Local APIC ID directly from
>> the MADT. So, we get the mapping of
>> *Processor ID/UID <-> Local APIC ID <-> Logical CPU ID*
>>
>> Step 2. Make the "Processor ID/UID <-> Node ID(_PXM)" mapping fixed
>> using DSDT:
>> The mapping of "Processor ID/UID <-> Node ID(_PXM)" is ready-made in
>> each entity. We just use it directly.
>>
>> However, ACPI tables are unreliable, and failures with that boot-time
>> mapping have been reported on machines where the ACPI table and the
>> physical information retrieved at actual hotplug are inconsistent. We
>> have already found two bugs:
>>
>> 1. Duplicated Processor IDs in DSDT.
>>     It has been fixed by commits:
>>     '8e089eaa1999 ("acpi: Provide mechanism to validate processors
>> in the ACPI tables")' and 'fd74da217df7 ("acpi: Validate processor id
>> when mapping the processor")'
>>
>> 2. The _PXM in DSDT is inconsistent with the one in MADT.
>>     It may cause the bug, which is shown in:
>>         https://lkml.org/lkml/2017/2/12/200
>>
>> One more phenomenon has been observed on some specific boxes:
>>
>> 1. The logical CPU IDs are not contiguous. Such as:
>>     Node2: 64-69, 72-77, 80-85, 88-93,...
>>
>> More strange things may happen in the future. We shouldn't just fix
>> them one by one every time; we should solve this problem at the source
>> to keep such problems from happening again and again.
>>
>> A simple and easy way to do this:
>>
>> 1. Do step 1 when the CPU's enabled flag is set.
>> 2. Do step 2 at hot-plug time, not at boot time, where we did some
>> useless work.
>>
>> This also keeps the mapping of "cpuid <-> nodeid" fixed and avoids
>> excessive use of the ACPI tables.
>>
>> Change log:
>>   v2 -> v3: 1. rewrite the changelogs:
>>                copy the changelogs Thomas Gleixner <tglx@...utronix.de>
>>                rewrote for patches 1, 2, 4, 5.
>>             2. s/duplicate_processor_id()/acpi_duplicate_processor_id()/
>>                on Thomas Gleixner <tglx@...utronix.de>'s advice.
>>             3. modify the error handling in acpi_processor_ids_walk()
>>                on Thomas Gleixner <tglx@...utronix.de>'s advice.
>>             4. add a new patch for restoring the order of CPU IDs
>>
>>   v1 -> v2: 1. fix some comments.
>>             2. add the verification of duplicate processor id.
>>
>> Dou Liyang (5):
>>   Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting"
>>   Revert "x86/acpi: Enable MADT APIs to return disabled apicids"
>>   x86/acpi: Restore the order of CPU IDs
>>   acpi/processor: Implement DEVICE operator for processor enumeration
>>   acpi/processor: Check for duplicate processor ids at hotplug time
>>
>>  arch/x86/kernel/acpi/boot.c   |   9 ++-
>>  arch/x86/kernel/apic/apic.c   |  26 +++------
>>  drivers/acpi/acpi_processor.c |  57 +++++++++++++-----
>>  drivers/acpi/bus.c            |   1 -
>>  drivers/acpi/processor_core.c | 133 +++++++-----------------------------------
>>  include/linux/acpi.h          |   5 +-
>>  6 files changed, 79 insertions(+), 152 deletions(-)
>>
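
For readers following the quoted cover letter: the ordering idea in step 1,
combined with the rule of numbering only CPUs whose enabled flag is set, can
be sketched roughly as below. This is a minimal user-space illustration, not
the kernel code from the patchset. The struct madt_cpu_entry, its fields and
the function assign_logical_ids() are hypothetical names invented for this
sketch, and it assumes the entries are already listed in ascending
Local APIC/x2APIC ID order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one MADT Local APIC/x2APIC entry. */
struct madt_cpu_entry {
    unsigned int processor_uid; /* Processor ID/UID from the entry */
    unsigned int apic_id;       /* Local APIC/x2APIC ID */
    int enabled;                /* non-zero if the enabled flag is set */
    int logical_id;             /* assigned logical CPU ID, -1 if none yet */
};

/*
 * Walk the entries in table order and hand out consecutive logical CPU IDs,
 * starting at first_free_id, to enabled entries that do not have one yet.
 * Disabled entries keep -1 so they can be numbered later, at hot-plug time.
 * Returns the next unused logical CPU ID.
 */
static int assign_logical_ids(struct madt_cpu_entry *entries, size_t count,
                              int first_free_id)
{
    int next_id = first_free_id;
    size_t i;

    assert(entries != NULL || count == 0);

    for (i = 0; i < count; i++) {
        if (!entries[i].enabled || entries[i].logical_id >= 0)
            continue;
        entries[i].logical_id = next_id++;
    }
    return next_id;
}
```

Calling assign_logical_ids() once at boot and again after a hot-added CPU's
entry becomes enabled gives the new CPU the next free ID after the boot-time
CPUs, which matches the contiguous numbering shown in the "After the
patchset" diagram.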

