Date:   Fri, 3 Mar 2017 16:32:28 +0800
From:   Dou Liyang <douly.fnst@...fujitsu.com>
To:     <mingo@...nel.org>, <tglx@...utronix.de>, <hpa@...or.com>,
        <rjw@...ysocki.net>, <lenb@...nel.org>, <xiaolong.ye@...el.com>,
        <guzheng1@...wei.com>, <izumi.taku@...fujitsu.com>
CC:     <x86@...nel.org>, <linux-acpi@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 0/5] Do repair works for the mapping of cpuid <->
 nodeid

Hi All,

My Simple Test Result:

On our box, a Fujitsu PQ2000 with 1 node available for hot-plug.

Before the patchset:

+-------------------------------------+
|                                     |
|  NUMA node0 CPU:    0-23,256-279    +------+
|  NUMA node1 CPU:    24-47,280-303   |      |
|                                     |      |
+-------------------------------------+      |
                                          Hot-plug
+-------------------------------------+      +
|                                     |      |
|  NUMA Node0: 0-23, 256-279          <------+
|  NUMA Node1: 24-47, 280-303         |
|  NUMA Node2: 64-69, 72-77, 80-85, 88-93...
|  NUMA Node3: 96-101, 104-109, 112-117,...
|                                     |      |
+-------------------------------------+      |
                                          Hot-remove
+-------------------------------------+      |
|                                     |      |
|  NUMA node0 CPU:    0-23,256-279    |      |
|  NUMA node1 CPU:    24-47,280-303   +^-----+
|                                     |
|                                     |
+-------------------------------------+

After the patchset:

+-------------------------------------+
|                                     |
|  NUMA node0 CPU:    0-23,48-71      +------+
|  NUMA node1 CPU:    24-47,72-95     |      |
|                                     |      |
+-------------------------------------+      |
                                          Hot-plug
+-------------------------------------+      +
|                                     |      |
|  NUMA node0 CPU:    0-23,48-71      <------+
|  NUMA node1 CPU:    24-47,72-95     |
|  NUMA node2 CPU:    96-143          +------+
|  NUMA node3 CPU:    144-191         |      |
|                                     |      |
+-------------------------------------+      |
                                          Hot-remove
+-------------------------------------+      |
|                                     |      |
|  NUMA node0 CPU:    0-23,48-71      |      |
|  NUMA node1 CPU:    24-47,72-95     +^-----+
|                                     |
|                                     |
+-------------------------------------+
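
A rough way to compare the two results is to count the runs of consecutive
logical CPU IDs in each node. The sketch below is only an illustration and is
not part of the patchset: the helper names are made up, and the node2 string
is just the visible part of the truncated line in the "before" diagram. It
assumes the per-node CPU lists use the usual sysfs cpulist format, as found in
/sys/devices/system/node/nodeN/cpulist.

```python
import glob
import os


def parse_cpulist(text):
    """Expand a kernel-style CPU list such as "0-23,48-71" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def to_ranges(cpus):
    """Collapse a sorted list of CPU IDs into (first, last) runs."""
    ranges = []
    for cpu in cpus:
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cpu)
        else:
            ranges.append((cpu, cpu))
    return ranges


def read_node_cpulists(root="/sys/devices/system/node"):
    """Return {node_id: cpulist string} from sysfs; empty dict if unavailable."""
    result = {}
    for path in glob.glob(os.path.join(root, "node[0-9]*", "cpulist")):
        node_dir = os.path.basename(os.path.dirname(path))  # e.g. "node2"
        with open(path) as f:
            result[int(node_dir[len("node"):])] = f.read().strip()
    return result


# Hot-plugged node2, before the patchset (visible part of the diagram only).
before_node2 = to_ranges(parse_cpulist("64-69,72-77,80-85,88-93"))
# Hot-plugged node2, after the patchset.
after_node2 = to_ranges(parse_cpulist("96-143"))
```

Here before_node2 holds four separate runs of six CPUs each, while after_node2
is a single run, which is the difference the two diagrams show.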

I also tested some cases in VMs with QEMU.

When I get more nodes, I will test the whole
function.

Thanks,
Liyang.

At 03/03/2017 04:02 PM, Dou Liyang wrote:
> [Summary]:
>
> 1, Revert two commits
> 2, Fix the order of Logical CPU IDs
> 3, Move the validation of processor IDs to hot-plug time.
>
> The mapping of "cpuid <-> nodeid" is established at boot time via ACPI
> tables to keep associations of workqueues and other node related items
> consistent across CPU hotplug, as follows:
>
> Step 1. Make the "Logical CPU ID <-> Processor ID/UID" fixed using MADT:
> We generate the logical CPU IDs in the order of the Local APIC/x2APIC IDs
> and get the mapping of Processor ID/UID <-> Local APIC ID directly from the
> MADT. So, we get the mapping of
> *Processor ID/UID <-> Local APIC ID <-> Logical CPU ID*
>
> Step 2. Make the "Processor ID/UID <-> Node ID(_PXM)" fixed using DSDT:
> The mapping of "Processor ID/UID <-> Node ID(_PXM)" is ready-made in
> each entity. We just use it directly.
>
> But ACPI tables are unreliable, and failures with that boot time mapping
> have been reported on machines where the ACPI table and the physical
> information which is retrieved at actual hotplug are inconsistent. We
> have already found two bugs:
>
> 1. Duplicated Processor IDs in DSDT.
> 	It has been fixed by commits:
> 	'8e089eaa1999 ("acpi: Provide mechanism to validate processors
> in the ACPI tables")' and 'fd74da217df7 ("acpi: Validate processor id
> when mapping the processor")'
>
> 2. The _PXM in DSDT is inconsistent with the one in MADT.
> 	It may cause the bug, which is shown in:
> 		https://lkml.org/lkml/2017/2/12/200
>
> And one phenomenon shows up on some specific boxes:
>
> 1. The logical CPU IDs are not contiguous. Such as:
> 	Node2: 64-69, 72-77, 80-85, 88-93,...
>
> More strange things may happen in the future. We shouldn't just fix them
> one by one every time; we should solve this problem at the source to
> keep such problems from happening again and again.
>
> We found a simple and easy way:
>
> 1. Do step 1 only when the CPU's enabled flag is set.
> 2. Do step 2 at hot-plug time, not at boot time, where we did some
> useless work.
>
> It also keeps the mapping of "cpuid <-> nodeid" fixed and avoids
> excessive use of the ACPI tables.
>
> Change log:
>   v2 -> v3: 1. rewrite the changelogs
> 		copy the changelogs Thomas Gleixner <tglx@...utronix.de>
> 		rewrote for patches 1,2,4,5.
>             2. s/duplicate_processor_id()/acpi_duplicate_processor_id().
> 		by Thomas Gleixner <tglx@...utronix.de>'s advice.
>             3. modify the error handling in acpi_processor_ids_walk()
> 		by Thomas Gleixner <tglx@...utronix.de>'s advice.
>             4. add a new patch for restoring the order of CPU IDs
>
>   v1 -> v2: 1. fix some comments.
>             2. add the verification of duplicate processor id.
>
> Dou Liyang (5):
>   Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting"
>   Revert "x86/acpi: Enable MADT APIs to return disabled apicids"
>   x86/acpi: Restore the order of CPU IDs
>   acpi/processor: Implement DEVICE operator for processor enumeration
>   acpi/processor: Check for duplicate processor ids at hotplug time
>
>  arch/x86/kernel/acpi/boot.c   |   9 ++-
>  arch/x86/kernel/apic/apic.c   |  26 +++------
>  drivers/acpi/acpi_processor.c |  57 +++++++++++++-----
>  drivers/acpi/bus.c            |   1 -
>  drivers/acpi/processor_core.c | 133 +++++++-----------------------------------
>  include/linux/acpi.h          |   5 +-
>  6 files changed, 79 insertions(+), 152 deletions(-)
>
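
The two-step mapping described in the quoted cover letter can be sketched
roughly as below. This is a toy model, not kernel code: MadtEntry,
assign_logical_ids() and cpu_to_node() are hypothetical names, the table
contents are invented, and it assumes that logical CPU IDs are handed out in
the order the entries are walked and only to entries whose enabled flag is
set. The Processor UID -> _PXM dictionary stands in for what would be read
from the processor objects at hot-plug time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MadtEntry:
    """Hypothetical, simplified stand-in for a MADT Local APIC/x2APIC entry."""
    processor_uid: int
    apic_id: int
    enabled: bool


def assign_logical_ids(madt):
    """Step 1: give each enabled entry the next logical CPU ID, in walk order."""
    uid_to_cpu = {}
    for entry in madt:
        if entry.enabled:
            uid_to_cpu[entry.processor_uid] = len(uid_to_cpu)
    return uid_to_cpu


def cpu_to_node(uid_to_cpu, uid_to_pxm):
    """Step 2: join on Processor UID to get logical CPU ID -> node ID."""
    return {
        cpu: uid_to_pxm[uid]
        for uid, cpu in uid_to_cpu.items()
        if uid in uid_to_pxm
    }


# Invented example: four entries, one of them disabled at boot.
madt = [
    MadtEntry(processor_uid=0, apic_id=0, enabled=True),
    MadtEntry(processor_uid=1, apic_id=2, enabled=True),
    MadtEntry(processor_uid=2, apic_id=4, enabled=False),
    MadtEntry(processor_uid=3, apic_id=6, enabled=True),
]
uid_to_pxm = {0: 0, 1: 0, 2: 1, 3: 1}  # Processor UID -> node ID (_PXM)

uid_to_cpu = assign_logical_ids(madt)
mapping = cpu_to_node(uid_to_cpu, uid_to_pxm)
```

With this input the disabled entry gets no logical CPU ID, so the enabled
CPUs are numbered 0, 1, 2 without a hole, and mapping is
{0: 0, 1: 0, 2: 1}.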

