Message-ID: <47B748EE.8020100@sgi.com>
Date:	Sat, 16 Feb 2008 12:34:54 -0800
From:	Mike Travis <travis@....com>
To:	Mel Gorman <mel@....ul.ie>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, mingo@...e.hu, tglx@...utronix.de,
	Christoph Lameter <clameter@....com>,
	Jack Steiner <steiner@....com>
Subject: Re: 2.6.24 git2/mm1: cpu_to_node mapping to non-existant nodes causing
 boot failure

Well, after lots of aggravation I finally got the dusty old
NUMA box to boot up, and the upstream linux-2.6 kernel works fine.
(I've attached the startup log showing 8 cores on 4 nodes.)

The big difference appears to be where the memory is located.  Your
box has all the memory on node 0, whereas my box has memory on
all the nodes...

From the working log:

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: PXM 2 -> APIC 4 -> Node 2
SRAT: PXM 2 -> APIC 5 -> Node 2
SRAT: PXM 3 -> APIC 6 -> Node 3
SRAT: PXM 3 -> APIC 7 -> Node 3
SRAT: Node 0 PXM 0 0-a0000
SRAT: Node 0 PXM 0 0-e4000000
SRAT: Node 0 PXM 0 0-200000000
SRAT: Node 1 PXM 1 200000000-400000000
SRAT: Node 2 PXM 2 400000000-600000000
SRAT: Node 3 PXM 3 600000000-800000000

From the failing log:

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 0 PXM 0 0-40000000

The message that CPUs 2 & 3 have no node is therefore misleading; it
should say that CPUs 2 & 3 have no "node local memory".  But other
than that, it should allocate the PERCPU memory on node 0 and
everything should be fine.
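
To make the expected fallback concrete, here is a rough sketch in plain
user-space C.  It is not kernel code: the struct, the function names and
the "lowest-numbered node with memory" policy are all hypothetical, and
it assumes the logical CPU number matches the APIC ID shown in the log.
The table is transcribed from the failing log above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_NODE (-1)

/* One "SRAT: Node N PXM P start-end" memory line (hypothetical layout). */
struct mem_range {
	int node;
	uint64_t start;
	uint64_t end;
};

/* Return 1 if at least one non-empty memory range belongs to 'node'. */
static int node_has_memory(const struct mem_range *r, size_t n, int node)
{
	for (size_t i = 0; i < n; i++)
		if (r[i].node == node && r[i].end > r[i].start)
			return 1;
	return 0;
}

/*
 * Pick the node to allocate a CPU's per-cpu area from: the CPU's own
 * node if it has memory, otherwise the lowest-numbered node that does,
 * otherwise NO_NODE.  The fallback policy is an assumption made for
 * this sketch only.
 */
static int percpu_alloc_node(const struct mem_range *r, size_t n, int cpu_node)
{
	int best = NO_NODE;

	assert(r != NULL || n == 0);

	if (cpu_node >= 0 && node_has_memory(r, n, cpu_node))
		return cpu_node;

	for (size_t i = 0; i < n; i++) {
		if (r[i].end <= r[i].start)
			continue;	/* skip empty ranges */
		if (best == NO_NODE || r[i].node < best)
			best = r[i].node;
	}
	return best;
}

/* Failing log: APIC 0,1 -> node 0; APIC 2,3 -> node 1; memory on node 0 only. */
static const int failing_cpu_node[] = { 0, 0, 1, 1 };
static const struct mem_range failing_mem[] = {
	{ 0, 0x0ULL, 0x40000000ULL },
};
```

With this table, percpu_alloc_node(failing_mem, 1, failing_cpu_node[2])
picks node 0 for CPU 2, even though the CPU itself sits on node 1.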

Apparently, then, the only way to debug this is to fake a setup where
some CPUs have no node local memory and go from there.  Unfortunately,
the box I'm testing on has no working remote access.  I'll see if I
can fake it out on a non-NUMA box until the lab is back open on Tuesday.

Thanks,
Mike

Mel Gorman wrote:
> On (14/02/08 12:41), Mike Travis didst pronounce:
>> Mel Gorman wrote:
...
>>>>
>>> According to git-bisect, the problem patch is below. It doesn't back out
>>> cleanly so I haven't verified for sure the bisect is correct yet.
>> This might make sense.  This code is in preparation for the extended
>> apic's available on the new processors.  I've tested the code with
>> our simulator (with no errors) and I'm setting up to test on a real
>> machine that has multiple numa nodes.  I wonder if maybe BIOS is not
>> providing correct node data, or the ACPI parsing is in error?  You
>> might try adding "apic=debug" to the boot command line.
>>
> 
> I tried this, but the dmesg complained about a malformed option. I'll
> check out why tomorrow but it didn't appear particularly helpful.
> 
>> For the short term, we can remove this patch if it's causing the
>> problem.  A more complete patch will be available soon that contains
>> the entire set of x2apic changes.
>>
> 
> If you send me patches to apply on top of 2.6.25-rc1, I'll give them a spin
> on the machine in question. Reverting didn't work out very well as there are
> too many collisions with patches that were applied later. I eventually got
> the machine booting but it only succeeds because it only brings up one core
> on each processor.  The patch, which is pretty brain damaged is below in case
> it helps you guess what the real problem is. dmesg logs are attached of the
> vanilla failure with acpi=debug and the log with the patch applied showing
> "__cpu_up: bad cpu 1" and "__cpu_up: bad cpu3" (i.e. the second cores of
> each machine).
> 
> 
> diff -ru linux-2.6/arch/x86/kernel/genapic_64.c linux-2.6-working/arch/x86/kernel/genapic_64.c
> --- linux-2.6/arch/x86/kernel/genapic_64.c	2008-02-14 16:32:55.000000000 -0600
> +++ linux-2.6-working/arch/x86/kernel/genapic_64.c	2008-02-14 15:46:18.000000000 -0600
> @@ -25,10 +25,10 @@
>  #endif
>  
>  /* which logical CPU number maps to which CPU (physical APIC ID) */
> -u16 x86_cpu_to_apicid_init[NR_CPUS] __initdata
> +u8 x86_cpu_to_apicid_init[NR_CPUS] __initdata
>  					= { [0 ... NR_CPUS-1] = BAD_APICID };
>  void *x86_cpu_to_apicid_early_ptr;
> -DEFINE_PER_CPU(u16, x86_cpu_to_apicid) = BAD_APICID;
> +DEFINE_PER_CPU(u8, x86_cpu_to_apicid) = BAD_APICID;
>  EXPORT_PER_CPU_SYMBOL(x86_cpu_to_apicid);
>  
>  struct genapic __read_mostly *genapic = &apic_flat;
> diff -ru linux-2.6/arch/x86/kernel/mpparse_64.c linux-2.6-working/arch/x86/kernel/mpparse_64.c
> --- linux-2.6/arch/x86/kernel/mpparse_64.c	2008-02-14 16:32:55.000000000 -0600
> +++ linux-2.6-working/arch/x86/kernel/mpparse_64.c	2008-02-14 15:45:44.000000000 -0600
> @@ -67,7 +67,7 @@
>  /* Bitmask of physically existing CPUs */
>  physid_mask_t phys_cpu_present_map = PHYSID_MASK_NONE;
>  
> -u16 x86_bios_cpu_apicid_init[NR_CPUS] __initdata
> +u8 x86_bios_cpu_apicid_init[NR_CPUS] __initdata
>  				= { [0 ... NR_CPUS-1] = BAD_APICID };
>  void *x86_bios_cpu_apicid_early_ptr;
>  DEFINE_PER_CPU(u16, x86_bios_cpu_apicid) = BAD_APICID;
> diff -ru linux-2.6/include/asm-x86/smp_64.h linux-2.6-working/include/asm-x86/smp_64.h
> --- linux-2.6/include/asm-x86/smp_64.h	2008-02-14 16:33:04.000000000 -0600
> +++ linux-2.6-working/include/asm-x86/smp_64.h	2008-02-14 15:43:01.000000000 -0600
> @@ -26,15 +26,16 @@
>  extern int smp_call_function_mask(cpumask_t mask, void (*func)(void *),
>  				  void *info, int wait);
>  
> -extern u16 __initdata x86_cpu_to_apicid_init[];
> -extern u16 __initdata x86_bios_cpu_apicid_init[];
> +extern u8 __initdata x86_cpu_to_apicid_init[];
> +extern u8 __initdata x86_bios_cpu_apicid_init[];
>  extern void *x86_cpu_to_apicid_early_ptr;
>  extern void *x86_bios_cpu_apicid_early_ptr;
> +DECLARE_PER_CPU(u8, x86_cpu_to_apicid); /* physical ID */
> +extern u8 bios_cpu_apicid[];
>  
>  DECLARE_PER_CPU(cpumask_t, cpu_sibling_map);
>  DECLARE_PER_CPU(cpumask_t, cpu_core_map);
>  DECLARE_PER_CPU(u16, cpu_llc_id);
> -DECLARE_PER_CPU(u16, x86_cpu_to_apicid);
>  DECLARE_PER_CPU(u16, x86_bios_cpu_apicid);
>  
>  static inline int cpu_present_to_apicid(int mps_cpu)
> 
> 



View attachment "test-2.txt" of type "text/plain" (49514 bytes)
