linux-kernel - Re: crash in kmem_cache

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <29495f1d0801221412s2642a24u6cb2ea8c9340aed2@mail.gmail.com>
Date:	Tue, 22 Jan 2008 14:12:06 -0800
From:	"Nish Aravamudan" <nish.aravamudan@...il.com>
To:	"Olaf Hering" <olaf@...fle.de>
Cc:	"Mel Gorman" <mel@....ul.ie>,
	"Christoph Lameter" <clameter@....com>,
	"Pekka Enberg" <penberg@...helsinki.fi>,
	linux-kernel@...r.kernel.org, linuxppc-dev@...abs.org,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	"hanth Aravamudan" <nacc@...ibm.com>,
	"KAMEZAWA Hiroyuki" <kamezawa.hiroyu@...fujitsu.com>,
	lee.schermerhorn@...com, "Linux MM" <linux-mm@...ck.org>,
	akpm@...ux-foundation.org
Subject: Re: crash in kmem_cache_init

On 1/22/08, Olaf Hering <olaf@...fle.de> wrote:
> On Tue, Jan 22, Mel Gorman wrote:
>
> > http://www.csn.ul.ie/~mel/postings/slab-20080122/partial-revert-slab-changes.patch
> > .. Can you please check on your machine if it fixes your problem?
>
> It does not fix or change the nature of the crash.
>
> > Olaf, please confirm whether you need the patch below as well as the
> > revert to make your machine boot.
>
> It crashes now in a different way if the patch below is applied:

Was this with the revert Mel mentioned applied as well? I get the
feeling both patches are needed to fix up the memoryless SLAB issue.

> Linux version 2.6.24-rc8-ppc64 (olaf@...gonberry) (gcc version 4.1.2 20070115 (prerelease) (SUSE Linux)) #43 SMP Tue Jan 22 22:39:05 CET 2008

<snip>

> early_node_map[1] active PFN ranges
>     1:        0 ->   892928

<snip>

> Unable to handle kernel paging request for data at address 0x00000058
> Faulting instruction address: 0xc0000000000fe018
> cpu 0x0: Vector: 300 (Data Access) at [c00000000075bac0]
>     pc: c0000000000fe018: .setup_cpu_cache+0x184/0x1f4
>     lr: c0000000000fdfa8: .setup_cpu_cache+0x114/0x1f4
>     sp: c00000000075bd40
>    msr: 8000000000009032
>    dar: 58
>  dsisr: 42000000
>   current = 0xc000000000665a50
>   paca    = 0xc000000000666380
>     pid   = 0, comm = swapper
> enter ? for help
> [c00000000075bd40] c0000000000fb368 .kmem_cache_create+0x3c0/0x478 (unreliable)
> [c00000000075be20] c0000000005e6780 .kmem_cache_init+0x284/0x4f4
> [c00000000075bee0] c0000000005bf8ec .start_kernel+0x2f8/0x3fc
> [c00000000075bf90] c000000000008590 .start_here_common+0x60/0xd0
> 0:mon>
>
> 0xc0000000000fe018 is in setup_cpu_cache (/home/olaf/kernel/git/linux-2.6-numa/mm/slab.c:2111).
> 2106                                    BUG_ON(!cachep->nodelists[node]);
> 2107                                    kmem_list3_init(cachep->nodelists[node]);

I might be barking up the wrong tree, but this block above is supposed
to set up the cachep->nodeslists[*] that are used immediately below.
But if the loop wasn't changed from N_NORMAL_MEMORY to N_ONLINE or
whatever, you might get a bad access right below for node 0 that has
no memory, if that's the node we're running on...

> 2108                            }
> 2109                    }
> 2110            }
> 2111            cachep->nodelists[numa_node_id()]->next_reap =
> 2112                            jiffies + REAPTIMEOUT_LIST3 +
> 2113                            ((unsigned long)cachep) % REAPTIMEOUT_LIST3;
> 2114
> 2115            cpu_cache_get(cachep)->avail = 0;

Thanks,
Nish
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/