Date:	Thu, 01 Apr 2010 09:09:24 -0400
From:	Andy Lutomirski <luto@...ealbox.com>
To:	linux-kernel@...r.kernel.org
Cc:	Eric Anholt <eric@...olt.net>, linux-kernel@...r.kernel.org
Subject: Re: i915 lockup / extreme delay

Karl Vogel wrote:
> On Mon, Mar 22, 2010 at 4:34 PM, Eric Anholt <eric@...olt.net> wrote:
>> On Mon, 22 Mar 2010 09:11:06 +0100, Karl Vogel <karl.vogel@...il.com> wrote:
>>> On Mon, Mar 22, 2010 at 5:20 AM, Eric Anholt <eric@...olt.net> wrote:
>>>> On Sat, 20 Mar 2010 14:41:41 +0100, Karl Vogel <karl.vogel@...il.com> wrote:
>>>>> The 'effect' is that only the mouse pointer works in the X server. The
>>>>> cpu usage on the laptop during the sluggishness is minimal. When I
>>>>> suspend the game with winedbg, the X server slowly becomes responsive again.
>>>>>
>>>>> The output from latencytop seems to point to i915 being the culprit:
>>>> If there's some code doing glFlush()es, it's probably that code at
>>>> fault.  You don't need to do that unless you're doing frontbuffer
>>>> rendering, and if you're doing frontbuffer rendering you should really
>>>> be doing backbuffer rendering.  I don't see a kernel issue here.
>>> That doesn't explain why the box completely locks up on 2.6.34-rc2
>>> though, where only a cold reboot works.
>> Missed that part of the message.  If there's a regression, bisect
>> please.
> 
> Apparently the crash was caused by a hardware bug in the Intel chipset
> (8086:2a40 rev 07). While doing the bisect I got an error:
> 
> DRHD: handling fault status reg 2
> DMAR:[DMA Write] Request device [00:02.0] fault addr dd69a000
> DMAR:[fault reason 05] PTE Write access is not set
> 
> After some googling around, I found this bugzilla entry which explains it:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=538163#c58
> 
> The issue appears to be that the graphics chip is corrupting memory:
> 
> "Unfortunately, this particular chipset sometimes reads from the GTT, does the
> translation, then writes the translated address back to the _original_ GTT
> instead of to the shadow GTT. That's why you're seeing real physical addresses
> where you should have 'virtual DMA addresses', and you get the faults."
> 
> Adding "intel_iommu=igfx_off" to the kernel command line resolved the issue.
> The Fedora kernel automatically disables this when it detects this particular
> chipset revision.
> 
> As for the freeze/slowdown right after booting, sysprof shows that more than 77%
> of the time is spent inside: drm_mode_getconnector

http://lists.freedesktop.org/archives/intel-gfx/2010-February/005922.html

I'm waiting for the encoder/connector stuff to get merged before I 
either pester people about that bug again or try to fix it myself.

You can try the same hack I use (comment out the initialization of all 
digital outputs) if you don't use them -- that completely fixes it for me.
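
Regarding the intel_iommu=igfx_off workaround and the DMAR fault quoted
above: the following is only a minimal sketch, not anything taken from the
kernel or from Fedora's detection logic. It assumes the command line is
available as a string (for example the contents of /proc/cmdline) and that
fault lines look like the one quoted. The helper names
iommu_igfx_disabled and parse_dmar_fault are made up for illustration.

```python
import re

def iommu_igfx_disabled(cmdline: str) -> bool:
    """True if an intel_iommu= option on the command line includes igfx_off.

    intel_iommu takes a comma-separated list of options, so both
    "intel_iommu=igfx_off" and "intel_iommu=on,igfx_off" count.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "intel_iommu" and sep and "igfx_off" in value.split(","):
            return True
    return False

# Pattern modelled on the single fault line quoted in this thread;
# other kernel versions may format the message differently.
DMAR_FAULT = re.compile(
    r"DMAR:\[(DMA (?:Read|Write))\] Request device \[([0-9a-fA-F:.]+)\] "
    r"fault addr ([0-9a-fA-F]+)"
)

def parse_dmar_fault(line: str):
    """Return (access, device, address) for a DMAR fault line, else None."""
    match = DMAR_FAULT.search(line)
    if match is None:
        return None
    access, device, addr = match.groups()
    return access, device, int(addr, 16)

if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print("igfx_off set:", iommu_igfx_disabled(f.read()))
    except OSError:
        print("/proc/cmdline not readable on this system")
```

Feeding the quoted "DMAR:[DMA Write] Request device [00:02.0] ..." line to
parse_dmar_fault should pick out device 00:02.0, which is the integrated
graphics device on this chipset.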

--Andy
