Message-ID: <20090812135036.GE19269@csn.ul.ie>
Date: Wed, 12 Aug 2009 14:50:36 +0100
From: Mel Gorman <mel@....ul.ie>
To: Juergen Beisert <jbe@...gutronix.de>
Cc: linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.arm.linux.org.uk,
linux-hotplug@...r.kernel.org
Subject: Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD
is set" fails on my system
On Wed, Aug 12, 2009 at 01:11:34PM +0200, Juergen Beisert wrote:
> On Wednesday, 12 August 2009, Mel Gorman wrote:
> > > > I get the following Ooops message when "udevadm" is running on an ARM
> > > > S3C2440 CPU based system:
> >
> > This is extremely odd. All that patch is doing is changing what order pages
> > are returned in to the caller when __GFP_COLD is specified. valid memory.
> > Does reverting the patch really make the problem go away?
>
> At least I can work with the system if I remove this patch. There is no oops,
> so udev creates all the required devnodes and the system comes up into the
> login prompt.
>
One reason I can think of that the patch would make a difference to booting
is that there is a buffer overrun somewhere. When the pages are in one order,
the buffer overrun is into pages that are not being used so it's not spotted.
In the other order, the overrun causes damage. The patch only alters the
order of pages in a linked list and ordinarily that shouldn't make any
functional difference.
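
To make the hypothesis concrete, here is a small userspace sketch, not
kernel code. It assumes a toy arena of 16-byte "pages" and a hypothetical
helper overrun_is_harmless(); none of these names come from the patch or
the page allocator. The same buggy write is harmless when pages are handed
out in one order and corrupts live data when they are handed out in the
other.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SZ 16   /* toy page size for illustration, not a real one */
#define NPAGES  3    /* pages the toy allocator may hand out */

/* One spare page at the end so the overrun stays inside the array. */
static char arena[(NPAGES + 1) * PAGE_SZ];

/* Hand out the nth page in ascending or descending address order. */
static char *page_at(int nth, int ascending)
{
    assert(nth >= 0 && nth < NPAGES);
    int idx = ascending ? nth : NPAGES - 1 - nth;
    return arena + idx * PAGE_SZ;
}

/* Returns 1 if the live data survived the buggy write, 0 if not. */
static int overrun_is_harmless(int ascending)
{
    memset(arena, 0, sizeof(arena));

    char *buf  = page_at(0, ascending);  /* first page: the buggy buffer */
    char *name = page_at(1, ascending);  /* second page: live data */
    strcpy(name, "kobject");

    /* The bug: 4 bytes too many are written into a PAGE_SZ buffer. */
    memset(buf, 0xAA, PAGE_SZ + 4);

    return strcmp(name, "kobject") == 0;
}
```

In ascending order the buffer sits directly below the live string, so the
extra 4 bytes land on it. In descending order the extra bytes land in the
spare page and nothing notices.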
Can you enable the config option DEBUG_PAGEALLOC please and tell me if
that blows up in some unexpected fashion? It would also be helpful if
you could enable all slab/slqb/slub debugging (whichever one you are
using).
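
As a rough sketch, the .config fragment for that request might look like
the following. It assumes SLUB is the allocator in use, with the SLAB
equivalent shown commented out. The option names should be checked against
the Kconfig files in your tree, and whether DEBUG_PAGEALLOC can be
selected at all depends on the architecture and kernel version.

```
# Sketch only, verify each symbol exists in your tree
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_PAGEALLOC=y
# If SLUB is the allocator in use:
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB_DEBUG_ON=y
# If SLAB is the allocator in use instead:
# CONFIG_DEBUG_SLAB=y
```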
> > > > [...]
> > > > starting udevd...done
> > > > Unable to handle kernel paging request at virtual address e3540000
> > > > pgd = c39d4000
> > > > [e3540000] *pgd=00000000
> > > > Internal error: Oops: 5 [#1]
> > > > Modules linked in:
> > > > CPU: 0 Not tainted (2.6.31-rc4-00296-ge084b2d-dirty #10)
> > > > PC is at strlen+0xc/0x20
> > > > LR is at kobject_get_path+0x24/0xa4
> >
> > I haven't tackled this sort of bug before but it looks more likely that
> > there is garbage in the sysfs tree that is being tripped up on.
>
> Yes, I think so, too. Because the same binary rc5 image runs on an S3C2410 CPU
> without an oops, but oopses on an S3C2440 (both CPUs are nearly the same, but
> only nearly). But how to track down such a failure?
>
Let's start with a full dmesg with CONFIG_DEBUG_KOBJECT and
CONFIG_DEBUG_OBJECTS set and see if anything springs up that looks
unusual on that platform.
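
For reference, a sketch of the corresponding .config lines. The
CONFIG_DEBUG_KERNEL line is an assumption on my part, since the debug
options normally sit behind it.

```
# Sketch only, verify each symbol exists in your tree
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_KOBJECT=y
CONFIG_DEBUG_OBJECTS=y
```

Kobject debugging is verbose, so it is worth capturing the log as early as
possible after boot, for example with "dmesg > dmesg.txt".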
Thanks
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab