Message-ID: <20080204104232.GB29484@csn.ul.ie>
Date:	Mon, 4 Feb 2008 10:42:32 +0000
From:	Mel Gorman <mel@....ul.ie>
To:	Gerhard Pircher <gerhard_pircher@....net>
Cc:	linux-kernel@...r.kernel.org, linuxppc-dev@...abs.org
Subject: Re: Commit for mm/page_alloc.c breaks boot process on my machine

On (01/02/08 22:06), Gerhard Pircher didst pronounce:
> 
> -------- Original Message --------
> > Date: Fri, 1 Feb 2008 20:25:18 +0000
> > From: Mel Gorman <mel@....ul.ie>
> > To: Gerhard Pircher <gerhard_pircher@....net>
> > CC: linux-kernel@...r.kernel.org
> > Subject: Re: Commit for mm/page_alloc.c breaks boot process on my machine
> 
> > I meant uninitialised memory but I also wonder could something like this
> > happen if you are trying to use memory that doesn't exist. i.e. you are
> > trying to access more memory than you really have but you indicate later
> > that this is not the case.
>
> Good question. The memory is in the physical address range from 0x00000000
> to 0x60000000 (1536MB).
> 
> > > > 2. Any chance of seeing a dmesg log?
> > > That's a little bit of a problem. The kernel log in memory doesn't show
> > > any kernel oops, but is also fragmented (small fragments seem to have
> > > been overwritten with 0x0).
> > 
> > err, that doesn't sound very healthy.
>
> Yeah, I know. But the platform code hasn't changed much when porting it
> from arch/ppc to arch/powerpc. That's why I'm a little bit lost in this
> case. :-)
> 

I'm at a bit of a loss to guess what might have changed in powerpc code
that would explain this. I've added the linuxppc-dev mailing list in
case they can make a guess.

I think you are also going to need to start bisecting, searching
specifically for the patch that causes these overwrites.

> > > Well, I can't answer this question. The kernel currently locks up when
> > > loading the INIT program. But that is another problem (I still have to
> > > bisect it) and doesn't seem to be related to this problem.
> > 
> > INIT would be the first MOVABLE allocation so it would be using memory
> > at the end of the physical address range. i.e. the crash happens when
> > memory towards the end is used, and the only difference between the
> > patch applied and reverted is when it happens.
> Oh, that sounds interesting!
> 
> > Could you try booting with 16MB less memory using mem=?
> I started the kernel with 512MB RAM (mem=496) and 1.5GB (mem=1520). The
> kernel oopses in both cases with an "Unable to handle kernel paging request
> for data address 0xbffff000", followed by an "Oops: kernel access of bad
> area, sig 11" message. The end of the stack trace shows the start_here()
> function.
> I'm not a PowerPC expert, but if 0xbffff000 is a virtual address, then
> it would be in the user program address space, right? If it is a physical
> address, then it is somewhere in the unallocated PCI address space.
> 
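
For reference, the numbers quoted above can be checked with a few lines of
arithmetic. This is only a sketch: the helper name mem_arg_mb is made up
for illustration, and the 16MB trim is simply the amount suggested earlier
in the thread.

```python
MB = 1024 * 1024

def mem_arg_mb(installed_mb, trim_mb=16):
    """Hypothetical helper: size in MB to boot with after trimming the top of RAM."""
    return installed_mb - trim_mb

# End of the physical range reported in this thread, expressed in MB.
ram_end = 0x60000000
ram_end_mb = ram_end // MB

print(ram_end_mb)               # 1536
print(mem_arg_mb(512))          # 496
print(mem_arg_mb(ram_end_mb))   # 1520
```

Both tested values therefore leave the top 16MB of the respective
configuration unused.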

It's a virtual address so it depends on the value of CONFIG_KERNEL_START
as to whether this is a user program address or not.
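
To make the dependency concrete, here is a rough sketch of the comparison.
The value 0xC0000000 is only an assumed kernel start for illustration, and
0x80000000 is a purely hypothetical alternative; the real value has to be
read from the kernel configuration of the machine in question.

```python
# Assumption: illustrative kernel start only; check CONFIG_KERNEL_START
# in the actual .config before drawing conclusions.
ASSUMED_KERNEL_START = 0xC0000000

def is_user_address(addr, kernel_start=ASSUMED_KERNEL_START):
    """True if a virtual address lies below the start of kernel space."""
    return addr < kernel_start

fault_addr = 0xBFFFF000  # data address from the oops message

# With the assumed kernel start, the fault is in user space,
# exactly 0x1000 bytes below the boundary.
print(is_user_address(fault_addr))
print(hex(ASSUMED_KERNEL_START - fault_addr))

# With a hypothetical lower kernel start, the same address is kernel space.
print(is_user_address(fault_addr, kernel_start=0x80000000))
```

So the same faulting address lands on either side of the user/kernel split
depending on how the kernel was configured.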

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab