Message-Id: <1250102430.4912.4.camel@green>
Date: Wed, 12 Aug 2009 14:40:30 -0400
From: Arnaud Faucher <arnaud.faucher@...il.com>
To: Juergen Beisert <jbe@...gutronix.de>
Cc: linux-kernel@...r.kernel.org, Mel Gorman <mel@....ul.ie>,
linux-arm-kernel@...ts.arm.linux.org.uk,
linux-hotplug@...r.kernel.org
Subject: Re: Patch "page-allocator: preserve PFN ordering when __GFP_COLD
is set" fails on my system
I have a rather similar problem with a driver that I try to keep
up-to-date with recent kernel versions
(http://code.ximeta.com/trac-ndas/ticket/1110#comment:30). The NDAS
hardware is an Ethernet-enabled disk controller on a single chip, a kind
of cheap iSCSI.
In my case there is no oops: the symptom is that the blocks read back
seem to be swapped or full of garbage.
After investigating the NDAS code, I found that the bug triggers when
the driver tries to merge adjacent requests before sending them to the
controller. I had to disable this merge to restore normal behavior, at
the expense of reduced efficiency.
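To illustrate how a merge test can come to depend on page ordering, here
is a rough sketch. It is hypothetical: the struct, its fields, and the
function names are mine for illustration and are not taken from the NDAS
code or the block layer. It assumes the failure mode is a test that
checks adjacency on disk only and takes for granted that the data
buffers follow each other in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Hypothetical request descriptor; not the real NDAS or block-layer type. */
struct io_req {
    uint64_t  sector;      /* first sector on disk */
    uint32_t  nr_sectors;  /* request length in sectors */
    uintptr_t buf_addr;    /* start address of the data buffer */
};

/*
 * Disk-only test: true when b starts on the sector right after a.
 * Says nothing about where the two buffers live in memory.
 */
static bool disk_adjacent(const struct io_req *a, const struct io_req *b)
{
    assert(a != NULL && b != NULL);
    return a->sector + a->nr_sectors == b->sector;
}

/*
 * Stricter test: the requests may be sent as one contiguous transfer
 * only if b's buffer also starts exactly where a's buffer ends.
 */
static bool can_merge(const struct io_req *a, const struct io_req *b)
{
    uintptr_t a_end = a->buf_addr + (uintptr_t)a->nr_sectors * SECTOR_SIZE;

    return disk_adjacent(a, b) && a_end == b->buf_addr;
}
```

If a driver merged on disk_adjacent() alone and then treated the data as
one contiguous buffer, it would only work as long as the allocator
happened to hand out pages in ascending order. With the order reversed,
the second request's buffer sits below the first one, and the merged
transfer would touch memory that does not belong to it.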
> On Mittwoch, 12. August 2009, Mel Gorman wrote:
> > On Wed, Aug 12, 2009 at 01:11:34PM +0200, Juergen Beisert wrote:
> > > On Mittwoch, 12. August 2009, Mel Gorman wrote:
> > > > > > I get the following Ooops message when "udevadm" is running on an
> > > > > > ARM S3C2440 CPU based system:
> > > >
> > > > This is extremely odd. All that patch is doing is changing what order
> > > > pages are returned in to the caller when __GFP_COLD is specified. [...]
> > > > valid memory. Does reverting the patch really make the problem go away?
> > >
> > > At least I can work with the system if I remove this patch. There is no
> > > oops, so udev creates all the required devnodes and the system comes up
> > > into the login prompt.
> >
> > One reason I can think of that the patch would make a difference to booting
> > is that there is a buffer overrun somewhere. When the pages are in one order,
> > the buffer overrun is into pages that are not being used so it's not
> > spotted. In the other order, the overrun causes damage. The patch only
> > alters the order of pages in a linked list and ordinarily that shouldn't
> > make any functional difference.
> >
[...]
> After this oops, system startup continues. Then the next oops occurs:
>
> This one is new, since I try to mount the connected SD card.
>
Mel's buffer overrun theory seems to apply in the NDAS driver case,
where the original request adjacency test seems faulty.
Could it also be the cause of the SD mounting crash?