Message-ID: <4BB49B04.2000303@myrealbox.com>
Date: Thu, 01 Apr 2010 09:09:24 -0400
From: Andy Lutomirski <luto@...ealbox.com>
To: linux-kernel@...r.kernel.org
Cc: Eric Anholt <eric@...olt.net>, linux-kernel@...r.kernel.org
Subject: Re: i915 lockup / extreme delay

Karl Vogel wrote:
> On Mon, Mar 22, 2010 at 4:34 PM, Eric Anholt <eric@...olt.net> wrote:
>> On Mon, 22 Mar 2010 09:11:06 +0100, Karl Vogel <karl.vogel@...il.com> wrote:
>>> On Mon, Mar 22, 2010 at 5:20 AM, Eric Anholt <eric@...olt.net> wrote:
>>>> On Sat, 20 Mar 2010 14:41:41 +0100, Karl Vogel <karl.vogel@...il.com> wrote:
>>>>> The 'effect' is that only the mouse pointer works in the X server. The
>>>>> cpu usage on the laptop during the sluggishness is minimal. When I
>>>>> suspend the game with winedbg, the X server slowly becomes responsive again.
>>>>>
>>>>> The output from latencytop seems to point to i915 being the culprit:
>>>> If there's some code doing glFlush()es, it's probably that code at
>>>> fault. You don't need to do that unless you're doing frontbuffer
>>>> rendering, and if you're doing frontbuffer rendering you should really
>>>> be doing backbuffer rendering. I don't see a kernel issue here.
> >>> That doesn't explain why the box completely locks up on 2.6.34-rc2
> >>> though, where only a cold reboot works.
>> Missed that part of the message. If there's a regression, bisect
>> please.
>
> Apparently the crash was caused by a hardware bug in the Intel chipset
> (8086:2a40 rev 07). While doing the bisect I got an error:
>
> DRHD: handling fault status reg 2
> DMAR:[DMA Write] Request device [00:02.0] fault addr dd69a000
> DMAR:[fault reason 05] PTE Write access is not set
>
> After some googling around, I found this bugzilla entry which explains it:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=538163#c58
>
> The issue appears to be that the graphics chip is corrupting memory:
>
> "Unfortunately, this particular chipset sometimes reads from the GTT, does the
> translation, then writes the translated address back to the _original_ GTT
> instead of to the shadow GTT. That's why you're seeing real physical addresses
> where you should have 'virtual DMA addresses', and you get the faults."
>
> Adding "intel_iommu=igfx_off" to the kernel command line resolved the issue.
> The Fedora kernel automatically disables the IOMMU for the graphics device
> when it detects this particular chipset revision.
>
> As for the freeze/slowdown right after booting, sysprof shows that more than 77%
> of the time is spent inside: drm_mode_getconnector

http://lists.freedesktop.org/archives/intel-gfx/2010-February/005922.html

I'm waiting for the encoder/connector stuff to get merged before I
either pester people about that bug again or try to fix it myself.

You can try the same hack I use (comment out the initialization of all
digital outputs) if you don't use them; that completely fixes it for me.

--Andy
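
For anyone checking their own machine, here is a rough Python sketch (not
from the thread) of two things discussed in the quoted text: pulling the
device and address out of a DMAR fault line like the one Karl posted, and
checking whether a kernel command line already carries
intel_iommu=igfx_off. The function names are made up for illustration, and
the handling of comma-separated intel_iommu= options is an assumption about
how the parameter is usually written.

```python
import re

# Matches the fault line quoted above, e.g.
# "DMAR:[DMA Write] Request device [00:02.0] fault addr dd69a000"
DMAR_FAULT = re.compile(
    r"DMAR:\[DMA (?P<op>Read|Write)\] "
    r"Request device \[(?P<dev>[0-9a-fA-F:.]+)\] "
    r"fault addr (?P<addr>[0-9a-fA-F]+)"
)


def parse_dmar_fault(line):
    """Return op, device and fault address from a DMAR fault line, or None.

    Hypothetical helper: it only knows the log format shown in this thread.
    """
    match = DMAR_FAULT.search(line)
    if match is None:
        return None
    return {
        "op": match.group("op"),
        "device": match.group("dev"),
        "addr": int(match.group("addr"), 16),
    }


def igfx_off_requested(cmdline):
    """Return True if the command line string contains intel_iommu=igfx_off.

    Assumption: intel_iommu= may carry several comma-separated options,
    such as "intel_iommu=on,igfx_off".
    """
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "intel_iommu" and "igfx_off" in value.split(","):
            return True
    return False
```

On a running Linux system, the string to pass to igfx_off_requested() would
typically be the contents of /proc/cmdline, and the lines for
parse_dmar_fault() would come from the kernel log.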