linux-kernel - Re: i915 lockup / extreme delay

Message-ID: <bad735c81003270254i5e1cdb3dv26c1566e5d005e9f@mail.gmail.com>
Date:	Sat, 27 Mar 2010 10:54:36 +0100
From:	Karl Vogel <karl.vogel@...il.com>
To:	Eric Anholt <eric@...olt.net>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: i915 lockup / extreme delay

On Mon, Mar 22, 2010 at 4:34 PM, Eric Anholt <eric@...olt.net> wrote:
> On Mon, 22 Mar 2010 09:11:06 +0100, Karl Vogel <karl.vogel@...il.com> wrote:
>> On Mon, Mar 22, 2010 at 5:20 AM, Eric Anholt <eric@...olt.net> wrote:
>> > On Sat, 20 Mar 2010 14:41:41 +0100, Karl Vogel <karl.vogel@...il.com> wrote:
>> >> The 'effect' is that only the mouse pointer works in the X server. The
>> >> cpu usage on the laptop during the sluggishness is minimal. When I
>> >> suspend the game with winedbg, the X server slowly becomes responsive again.
>> >>
>> >> The output from latencytop seems to point to i915 being the culprit:
>> >
>> > If there's some code doing glFlush()es, it's probably that code at
>> > fault.  You don't need to do that unless you're doing frontbuffer
>> > rendering, and if you're doing frontbuffer rendering you should really
>> > be doing backbuffer rendering.  I don't see a kernel issue here.
>>
>> That doesn't explain why the box completely locks up on 2.6.34-rc2
>> though, where only a cold reboot works.
>
> Missed that part of the message.  If there's a regression, bisect
> please.

Apparently the crash was caused by a hardware bug in the Intel chipset
(8086:2a40 rev 07). While doing the bisect I got this error:

DRHD: handling fault status reg 2
DMAR:[DMA Write] Request device [00:02.0] fault addr dd69a000
DMAR:[fault reason 05] PTE Write access is not set
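
For anyone comparing their own logs against this fault, here is a minimal Python sketch that pulls the access type, requesting device and fault address out of the middle line above. The regular expression is an assumption based only on the exact wording shown here; other kernel versions may format the message differently, and the function name is purely illustrative.

```python
import re

# Matches only the fault-detail line in the form quoted above.
# Assumption: other kernels may word this message differently.
FAULT_RE = re.compile(
    r"DMAR:\[(?P<kind>DMA (?:Read|Write))\] Request device "
    r"\[(?P<device>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] "
    r"fault addr (?P<addr>[0-9a-f]+)"
)


def parse_dmar_fault(line):
    """Return (kind, device, address) for a DMAR fault line, or None."""
    match = FAULT_RE.search(line)
    if match is None:
        return None
    # The address is printed as bare hex, so convert it with base 16.
    return match.group("kind"), match.group("device"), int(match.group("addr"), 16)


line = "DMAR:[DMA Write] Request device [00:02.0] fault addr dd69a000"
print(parse_dmar_fault(line))
```

Device 00:02.0 is the requester named in the log line itself, which is what ties the fault to the graphics device.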

After some googling around, I found this bugzilla entry which explains it:

https://bugzilla.redhat.com/show_bug.cgi?id=538163#c58

The issue appears to be that the graphics chip is corrupting memory:

"Unfortunately, this particular chipset sometimes reads from the GTT, does the
translation, then writes the translated address back to the _original_ GTT
instead of to the shadow GTT. That's why you're seeing real physical addresses
where you should have 'virtual DMA addresses', and you get the faults."

Adding "intel_iommu=igfx_off" to the kernel command line resolved the issue.
The Fedora kernel automatically disables IOMMU translation for the graphics
device when it detects this particular chipset revision.
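
To confirm that the workaround is actually present on a given boot, something like the following Python sketch could be run against the contents of /proc/cmdline. It assumes that intel_iommu= takes a comma-separated list of options (so "intel_iommu=on,igfx_off" would also count); the function name and the example command lines are made up for illustration.

```python
def igfx_off_requested(cmdline):
    """Return True if an intel_iommu= parameter includes igfx_off."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "intel_iommu" and sep:
            # Assumption: options after '=' are separated by commas.
            if "igfx_off" in value.split(","):
                return True
    return False


# Hypothetical command lines, not taken from the affected machine.
print(igfx_off_requested("ro root=/dev/sda1 intel_iommu=igfx_off quiet"))
print(igfx_off_requested("ro root=/dev/sda1 quiet"))
```

On a live system the string would come from reading /proc/cmdline rather than from a literal.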

As for the freeze/slowdown right after booting, sysprof shows that more than 77%
of the time is spent inside drm_mode_getconnector:

[/usr/bin/Xorg]                                                   0.00%  80.29%
 ioctl                                                           0.00%  78.47%
   - - kernel - -                                                0.01%  78.47%
     system_call_fastpath                                        0.00%  77.15%
       sys_ioctl                                                 0.00%  77.15%
         do_vfs_ioctl                                            0.01%  77.15%
           vfs_ioctl                                             0.00%  77.14%
             drm_ioctl                                           0.01%  77.14%
               drm_mode_getconnector                             0.00%  77.02%
                 drm_helper_probe_single_connector_modes         0.00%  77.02%
                   intel_lvds_get_modes                          0.00%  62.46%
                     intel_ddc_get_modes                         0.00%  62.46%
                       drm_get_edid                              0.00%  62.45%
                         drm_ddc_read_edid                       0.00%  62.45%
                           drm_do_probe_ddc_edid                 0.00%  62.45%
                             i2c_transfer                        0.00%  62.45%
                               bit_xfer                          0.01%  62.44%
                                 sclhi                           0.00%  25.76%
                                   set_clock                     0.08%  12.86%
                                   __udelay                      0.00%  12.78%
                                   get_clock                     0.11%   0.11%
                                   __const_udelay                0.01%   0.01%
                                 set_clock                       0.12%  12.54%
                                   __const_udelay                0.01%  12.42%
                                   __delay                       0.00%   0.00%
                                 __udelay                        0.00%  12.08%
                                 acknak                          0.00%   7.68%
                                 sdahi                           0.01%   2.30%
                                 try_address                     0.00%   1.20%
                                 i2c_outb                        0.00%   0.59%
                                 get_data                        0.11%   0.11%
                                 i2c_stop                        0.00%   0.09%
                                 i2c_repstart                    0.00%   0.07%
                                 i2c_start                       0.00%   0.01%
                                 __const_udelay                  0.01%   0.01%
                               __udelay                          0.01%   0.01%
                       drm_add_edid_modes                        0.00%   0.00%
                       drm_mode_connector_update_edid_property   0.00%   0.00%
                   intel_hdmi_detect                             0.00%   8.01%
                   intel_dp_detect                               0.00%   4.91%
                   intel_tv_detect                               0.00%   1.23%
                   intel_lvds_detect                             0.00%   0.38%
                   intel_crt_detect                              0.00%   0.03%
                   drm_get_connector_name                        0.00%   0.01%
               i915_gem_execbuffer                               0.00%   0.06%
               i915_gem_mmap_ioctl                               0.00%   0.01%
               i915_gem_set_domain_ioctl                         0.00%   0.01%
               drm_gem_close_ioctl                               0.00%   0.01%
               drm_mode_object_find                              0.00%   0.00%
               i915_gem_busy_ioctl                               0.00%   0.00%
               lock_kernel                                       0.00%   0.00%
               i915_gem_create_ioctl                             0.00%   0.00%
               copy_to_user                                      0.00%   0.00%
             copy_user_generic_string                            0.00%   0.00%
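
To put the numbers above in proportion, here is a small Python sketch that expresses each direct child of drm_helper_probe_single_connector_modes as a share of that function's cumulative time. The percentages are copied from the sysprof output; the helper name and the choice of which rows to compare are mine.

```python
# Cumulative percentages copied from the sysprof call tree above.
PROBE_TOTAL = 77.02  # drm_helper_probe_single_connector_modes

children = {
    "intel_lvds_get_modes": 62.46,
    "intel_hdmi_detect": 8.01,
    "intel_dp_detect": 4.91,
    "intel_tv_detect": 1.23,
    "intel_lvds_detect": 0.38,
    "intel_crt_detect": 0.03,
}


def shares(total, parts):
    """Express each entry as a percentage of the parent's cumulative time."""
    return {name: 100.0 * pct / total for name, pct in parts.items()}


result = shares(PROBE_TOTAL, children)
for name, share in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{name:24s} {share:5.1f}%")
```

By this arithmetic the LVDS EDID read (intel_lvds_get_modes, which bottoms out in the bit-banged i2c_transfer path) accounts for roughly 81% of the connector probing time, with the HDMI and DP detection a distant second and third.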