Message-ID: <56111.84.105.60.153.1287521237.squirrel@gate.crashing.org>
Date: Tue, 19 Oct 2010 22:47:17 +0200 (CEST)
From: "Segher Boessenkool" <segher@...nel.crashing.org>
To: pacman@...h.dhis.org
Cc: "Benjamin Herrenschmidt" <benh@...nel.crashing.org>,
"Mel Gorman" <mel@....ul.ie>, linux-mm@...ck.org,
"Andrew Morton" <akpm@...ux-foundation.org>,
linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: PROBLEM: memory corrupting bug, bisected to 6dda9d55
> I made a new discovery.
And this nails it :-)
> So then I ran
> dd if=/dev/mem bs=4 count=1 skip=$((0xfc5c080/4)) | od -t x4
> a few times very fast, plucking the first affected word directly out of
> memory by its physical address. The result:
>
> The low 16 bits are always zero as before. The high 16 bits are a counter,
> being incremented at about 1000Hz (as close as I could measure with a crude
> shell script. 1024Hz would also be within the margin of error). And it's
> little-endian.
> So what type of driver, firmware, or hardware bug puts a 16-bit 1000Hz timer
> in memory, and does it in little-endian instead of the CPU's native byte
> order? And why does it stop doing it some time during the early init scripts,
> shortly after the root filesystem fsck?
It looks like it is the frame counter in a USB OHCI HCCA:
16-bit, updated at 1kHz, at offset 0x80 in a page.
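To make the match with the observed word concrete, here is a rough sketch (not kernel code) of how the four bytes at that address would decode if they are the HCCA frame number followed by its 16-bit pad, both little-endian. The function names are made up for illustration, and the 256-byte HCCA alignment is taken from the OHCI spec rather than from anything measured here.

```python
import struct
from typing import Tuple

# Offset of the 16-bit frame number inside the HCCA (assumption: OHCI layout).
HCCA_FRAME_NUMBER_OFFSET = 0x80
# The HCCA is assumed to be 256-byte aligned, as the OHCI spec requires.
HCCA_ALIGNMENT = 0x100


def offset_in_hcca(phys_addr: int) -> int:
    """Offset of a physical address within a 256-byte-aligned HCCA."""
    return phys_addr & (HCCA_ALIGNMENT - 1)


def decode_hcca_word(raw: bytes) -> Tuple[int, int]:
    """Split the 4 bytes at HCCA offset 0x80 into (frame_number, pad).

    Both fields are read as little-endian 16-bit values.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    frame_number, pad = struct.unpack("<HH", raw)
    return frame_number, pad


def as_big_endian_word(raw: bytes) -> int:
    """The same 4 bytes as a big-endian CPU would load them as one word."""
    return struct.unpack(">I", raw)[0]


def frames_elapsed(first: int, second: int) -> int:
    """Frames between two samples, allowing for 16-bit wraparound."""
    return (second - first) & 0xFFFF


if __name__ == "__main__":
    sample = bytes([0x34, 0x12, 0x00, 0x00])  # made-up sample, not real data
    print(hex(offset_in_hcca(0xFC5C080)))
    print(decode_hcca_word(sample))
    print(hex(as_big_endian_word(sample)))
```

Read as one big-endian word, the counter bytes land in the high half (byte-swapped) and the pad shows up as a low half of zero, which is the pattern described above. Dividing frames_elapsed() of two samples by the time between them gives the update rate, provided the samples are less than about 65 seconds apart.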
So either the kernel forgot to call quiesce on it, or the firmware
doesn't implement that, or the firmware messed up some other way.
Segher