Date:	Fri, 24 Aug 2007 09:58:47 +0100
From:	mel@...net.ie (Mel Gorman)
To:	Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>
Cc:	Christoph Lameter <clameter@....com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org,
	Balbir Singh <balbir@...ux.vnet.ibm.com>, linux-mm@...ck.org
Subject: Re: [BUG] 2.6.23-rc3-mm1 kernel BUG at mm/page_alloc.c:2876!

On (24/08/07 11:45), Kamalesh Babulal didst pronounce:
> Christoph Lameter wrote:
> >On Thu, 23 Aug 2007, Kamalesh Babulal wrote:
> >
> >  
> >>After applying the patch, the call trace is gone but the kernel bug
> >>is still hit
> >>    
> >
> >Yes that is what we expected. We need more information to figure out why 
> >the kmalloc_node fails there. It should walk through all nodes to find 
> >memory.
> >
> >I see that you have 4 cpus and 16 nodes. How are the cpus assigned to 
> >nodes? If a cpu would be assigned to a nonexisting node then this could be 
> >the result.
> >
> >Could you post the full boot log?
> >
> >  
> boot log with the andrew patch applied
> 
> Welcome to yaboot version 1.3.13
> Enter "help" to get some basic usage information
> boot: autobench
> Please wait, loading kernel...
> Elf64 kernel loaded...
> Loading ramdisk...
> ramdisk loaded at 02400000, size: 1191 Kbytes
> OF stdout device is: /vdevice/vty@...00000
> Hypertas detected, assuming LPAR !
> command line: ro console=hvc0 autobench_args: root=/dev/sda6 
> ABAT:1187885681
> memory layout at init:
> alloc_bottom : 000000000252a000
> alloc_top : 0000000008000000
> alloc_top_hi : 0000000100000000
> rmo_top : 0000000008000000
> ram_top : 0000000100000000
> Looking for displays
> instantiating rtas at 0x00000000077d9000 ... done
> 0000000000000000 : boot cpu 0000000000000000
> 0000000000000002 : starting cpu hw idx 0000000000000002... done
> copying OF device tree ...
> Building dt strings...
> Building dt structure...
> Device tree strings 0x000000000262b000 -> 0x000000000262c1d3
> Device tree struct 0x000000000262d000 -> 0x0000000002635000
> Calling quiesce ...
> returning from prom_init
> Partition configured for 4 cpus.
> 
> 
> Starting Linux PPC64 #1 SMP Thu Aug 23 11:54:44 EDT 2007
> -----------------------------------------------------
> ppc64_pft_size = 0x1a
> physicalMemorySize = 0x100000000
> ppc64_caches.dcache_line_size = 0x80
> ppc64_caches.icache_line_size = 0x80
> htab_address = 0x0000000000000000
> htab_hash_mask = 0x7ffff
> -----------------------------------------------------
> Linux version 2.6.23-rc3-mm1-autokern1 
> (root@...ko-lp3.ltc.austin.ibm.com) (gcc version 3.4.6 20060404 (Red Hat 
> 3.4.6-3)) #1 SMP Thu Aug 23 11:54:44 EDT 2007
> [boot]0012 Setup Arch
> vmemmap cf00000000000000 allocated at c000000001000000, physical 
> 0000000001000000.
> vmemmap cf00000001000000 allocated at c000000004000000, physical 
> 0000000004000000.
> vmemmap cf00000002000000 allocated at c000000005000000, physical 
> 0000000005000000.
> vmemmap cf00000003000000 allocated at c000000006000000, physical 
> 0000000006000000.
> EEH: PCI Enhanced I/O Error Handling Enabled
> PPC64 nvram contains 7168 bytes
> Zone PFN ranges:
> DMA 0 -> 1048576
> Normal 1048576 -> 1048576
> Movable zone start PFN for each node
> early_node_map[1] active PFN ranges
> 2: 0 -> 1048576
> Could not find start_pfn for node 0
> [boot]0015 Setup Done
> Built 2 zonelists in Node order, mobility grouping off. Total pages: 0

This indicates to me that the zonelists are trashed: the line above reports
"Total pages: 0". All memory is on node 2 according to early_node_map[], and
the CPU is most likely part of node 0, which doesn't have a proper fallback
list.
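
To make the suspicion concrete, here is a toy model in Python. It is not
kernel code and not how the zonelists are really built; it only sketches why
an allocation issued from a memoryless node fails when that node has no usable
fallback list. The node count and page count come from the boot log above. The
helper names and the distance metric (absolute difference of node ids) are
assumptions made for illustration.

```python
# Toy model only, not kernel code. Node ids and page counts are taken from
# the boot log; the distance metric below is an assumption for illustration.

def build_fallback_lists(pages_per_node):
    """Return {node: [nodes that have memory, nearest first]} for every node."""
    with_memory = [n for n, pages in pages_per_node.items() if pages > 0]
    return {
        node: sorted(with_memory, key=lambda other: (abs(other - node), other))
        for node in pages_per_node
    }

def alloc_node(fallback_lists, node):
    """Return the node an allocation from `node` would land on, or None."""
    candidates = fallback_lists.get(node, [])
    return candidates[0] if candidates else None

# 16 possible nodes, all memory on node 2 (early_node_map: "2: 0 -> 1048576").
pages_per_node = {n: 0 for n in range(16)}
pages_per_node[2] = 1048576

good = build_fallback_lists(pages_per_node)

# Simulate the suspected failure: node 0 ends up with an empty fallback list.
broken = dict(good)
broken[0] = []

print(alloc_node(good, 0))    # 2
print(alloc_node(broken, 0))  # None
```

With a sane fallback list, a request from node 0 is satisfied from node 2.
With the empty list, the request has nowhere to go, which would match a
kmalloc_node failure on a CPU that sits on node 0.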

> Policy zone: DMA
> Kernel command line: ro console=hvc0 autobench_args: root=/dev/sda6 
> ABAT:1187885681
> [boot]0020 XICS Init
> [boot]0021 XICS Done
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> Console: colour dummy device 80x25
> console handover: boot [udbg0] -> real [hvc0]
> Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
> Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
> freeing bootmem node 2
> Memory: 4105840k/4194304k available (4964k kernel code, 88464k reserved, 
> 948k data, 571k bss, 264k init)
> SLUB: Genslabs=12, HWalign=128, Order=0-1, MinObjects=4, CPUs=4, Nodes=16
> ------------[ cut here ]------------
> kernel BUG at mm/page_alloc.c:2878!
> cpu 0x0: Vector: 700 (Program Check) at [c0000000005cbbe0]
> pc: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
> lr: c0000000004b5160: .setup_per_cpu_pageset+0x24/0x48
> sp: c0000000005cbe60
> msr: 8000000000029032
> current = 0xc0000000004fd1b0
> paca = 0xc0000000004fdd80
> pid = 0, comm = swapper
> kernel BUG at mm/page_alloc.c:2878!
> enter ? for help
> [c0000000005cbee0] c0000000004978d8 .start_kernel+0x304/0x3f4
> [c0000000005cbf90] c0000000003bef1c .start_here_common+0x54/0x58
> 
> -
> Kamalesh Babulal.
> 
> 
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab