Date:	Sat, 28 Feb 2015 15:20:37 +0300
From:	Andrey Skvortsov <andrej.skvortzov@...il.com>
To:	Chris Wilson <chris@...is-wilson.co.uk>,
	Daniel Vetter <daniel.vetter@...el.com>,
	Jani Nikula <jani.nikula@...ux.intel.com>,
	David Airlie <airlied@...ux.ie>,
	intel-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
	linux-kernel@...r.kernel.org, Sitsofe Wheeler <sitsofe@...il.com>
Subject: Re: [Intel-gfx] [Regression] WARNING:
 drivers/gpu/drm/i915/i915_gem.c:4525 i915_gem_free_object

On 24 Feb, Daniel Vetter wrote:
> On Mon, Feb 23, 2015 at 09:20:31PM +0000, Chris Wilson wrote:
> > On Mon, Feb 23, 2015 at 11:12:39PM +0300, Andrey Skvortsov wrote:
> > > Hi, 
> > > 
> > > This warning is moved from linux-next to v4.0-rc1 now. After system boot is just a black screen.
> > > I ssh'ed into the machine and saved the log. I attached updated dmesg.log with drm.debug=6. Hopefully it helps. 
> > > If you need any other debug information, traces, core dump or something else. Feel free to ask.
> > 
> > The warning from free_object is annoying (and quite possibly dangerous),
> > but the actual hang during boot is:
> > 
> > [  243.876375] INFO: task Xorg:2422 blocked for more than 120 seconds.
> > [  243.876382]       Tainted: G        W   E   4.0.0-rc1-150223- #2
> > [  243.876388] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [  243.876393] Xorg            D ffff88019fc12dc0     0  2422   2180 0x00400000
> > [  243.876404]  ffff8800dabfe1a0 0000000000000002 ffff880194537fd8 ffff880194537ba0
> > [  243.876416]  ffff8800dab9e22c ffff8800dabfe1a0 ffff8800dab9e230 00000000ffffffff
> > [  243.876426]  ffffffff813e2479 ffff8800dab9e228 ffffffff813e26a7 0000000000000000
> > [  243.876438] Call Trace:
> > [  243.876449]  [<ffffffff813e2479>] ? schedule+0x6f/0x7c
> > [  243.876459]  [<ffffffff813e26a7>] ? schedule_preempt_disabled+0x15/0x21
> > [  243.876469]  [<ffffffff813e3347>] ? __ww_mutex_lock_slowpath+0xdf/0x1c2
> > [  243.876480]  [<ffffffff813e3446>] ? __ww_mutex_lock+0x1c/0x93
> > [  243.876541]  [<ffffffffa050e70d>] ? modeset_lock+0x8f/0xf2 [drm]
> > [  243.876632]  [<ffffffffa09aa0b9>] ? intel_get_load_detect_pipe+0x80/0x427 [i915]
> > [  243.876674]  [<ffffffffa04fd42f>] ? drm_ut_debug_printk+0x5e/0x63 [drm]
> > [  243.876771]  [<ffffffffa09d4661>] ? intel_tv_detect+0x115/0x43a [i915]
> > [  243.876783]  [<ffffffff810608d9>] ? preempt_count_sub+0xbf/0xca
> > [  243.876809]  [<ffffffffa05d6f24>] ? drm_helper_probe_single_connector_modes_merge_bits+0xc6/0x38d [drm_kms_helper]
> > [  243.876860]  [<ffffffffa0505b5d>] ? drm_mode_getconnector+0xf4/0x2ac [drm]
> > [  243.876900]  [<ffffffffa04fa911>] ? drm_ioctl+0x338/0x3c5 [drm]
> > [  243.876949]  [<ffffffffa0505a69>] ? drm_mode_getcrtc+0xb3/0xb3 [drm]
> > [  243.876961]  [<ffffffff81167deb>] ? fsnotify+0x314/0x35d
> > [  243.876973]  [<ffffffff811487be>] ? do_vfs_ioctl+0x379/0x431
> > [  243.876983]  [<ffffffff811488cc>] ? SyS_ioctl+0x56/0x7c
> > [  243.876994]  [<ffffffff813e5152>] ? system_call_fastpath+0x12/0x17
> > 
> > i.e. it is a mutex deadlock inside tv detect. Daniel does that make sense?
> 
> Botch locking rework for atomic. Fix is
> 
> https://patchwork.kernel.org/patch/5861631/
> 
> and will land as soon as an affected user has provided a tested-by.
> Andrey, can you pls give this a spin?

Hi,

Tested-by: Andrey Skvortsov <andrej.skvortzov@...il.com>

The patch fixes the deadlock, and Xorg is running again.


Unfortunately, this is not the last bug that breaks i915/drm on my
laptop. Sometimes the system boots successfully with a couple of the
warnings mentioned in my previous mail:

[   26.922953] WARNING: CPU: 1 PID: 767 at drivers/gpu/drm/i915/i915_gem.c:4525 i915_gem_free_object+0x13f/0x288 [i915]()
[   26.922954] WARN_ON(obj->frontbuffer_bits)

and

[   36.794045] WARNING: CPU: 0 PID: 18 at include/linux/kref.h:47 drm_framebuffer_reference+0x60/0x6b [drm]()

but quite often the kernel crashes during boot. I captured the kernel
log over netconsole:

[   36.519781] BUG: unable to handle kernel NULL pointer dereference at 00000000000002ec
[   36.520752] IP: [<ffffffff8145e13a>] mutex_lock+0xe/0x29
[   36.520752] PGD 1952fb067 PUD 193c64067 PMD 0 
[   36.520752] Oops: 0002 [#1] PREEMPT SMP 
[   36.520752] Modules linked in: cfg80211(E) bnep(E) cpufreq_stats(E) cpufreq_powersave(E) cpufreq_userspace(E) cpufreq_conservative(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) cdc_ether(E) usbnet(E) coretemp(E) cdc_wdm(E) cdc_acm(E) kvm_intel(E) joydev(E) kvm(E) i8k(E) i915(E) btusb(E) snd_pcsp(E) bluetooth(E) psmouse(E) evdev(E) snd_hda_codec_generic(E) rfkill(E) lpc_ich(E) mfd_core(E) i2c_i801(E) serio_raw(E) snd_hda_intel(E) drm_kms_helper(E) snd_hda_controller(E) snd_hda_codec(E) drm(E) snd_hwdep(E) snd_pcm(E) i2c_algo_bit(E) i2c_core(E) battery(E) button(E) video(E) ac(E) snd_timer(E) snd(E) soundcore(E) acpi_cpufreq(E) processor(E) fuse(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_mod(E) sd_mod(E) ata_generic(E) ahci(E) libahci(E) firewire_ohci(E) ata_piix(E) libata(E) sdhci_pci(E) firewire_core(E) scsi_mod(E) sdhci(E) crc_itu_t(E) mmc_core(E) thermal(E) thermal_sys(E)
[   36.520752] CPU: 1 PID: 19 Comm: kworker/1:1 Tainted: G        W   E   4.0.0-rc1-150225--00001-gb802a6b #10
[   36.520752] Hardware name: Dell Inc. Vostro 1500                     /0NX907, BIOS A06 04/21/2008
[   36.520752] Workqueue: events output_poll_execute [drm_kms_helper]
[   36.520752] task: ffff880197f69aa0 ti: ffff880197d1c000 task.ti: ffff880197d1c000
[   36.520752] RIP: 0010:[<ffffffff8145e13a>]  [<ffffffff8145e13a>] mutex_lock+0xe/0x29
[   36.520752] RSP: 0018:ffff880197d1f838  EFLAGS: 00010246
[   36.520752] RAX: ffff8801974c80c0 RBX: 00000000000002ec RCX: 0000000080000000
[   36.520752] RDX: ffff88019fd00000 RSI: ffffffffa03fda6e RDI: 00000000000002ec
[   36.520752] RBP: ffff880197d1f848 R08: 0000000000000001 R09: ffffffff81ea9154
[   36.520752] R10: ffffffff81ea9154 R11: ffff88019fd0d300 R12: 00000000000002ec
[   36.520752] R13: 0000000000000004 R14: ffff880197744d80 R15: ffff88019547e000
[   36.520752] FS:  0000000000000000(0000) GS:ffff88019fd00000(0000) knlGS:0000000000000000
[   36.520752] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   36.520752] CR2: 00000000000002ec CR3: 00000000d71b9000 CR4: 00000000000006e0
[   36.520752] Stack:
[   36.520752]  ffff88019fd00000 ffff880197744d88 ffff880197d1f878 ffffffffa03fda98
[   36.520752]  ffffffffa03fda6e ffff880197744d88 0000000000000000 ffff880196c2e400
[   36.520752]  ffff880197d1f898 ffffffffa03fde15 ffff880197744d80 ffff8801974c80c0
[   36.520752] Call Trace:
[   36.520752]  [<ffffffffa03fda98>] drm_framebuffer_free+0x2a/0x56 [drm]
[   36.520752]  [<ffffffffa03fda6e>] ? drm_framebuffer_unregister_private+0x43/0x43 [drm]
[   36.520752]  [<ffffffffa03fde15>] kref_sub.constprop.33+0x34/0x3e [drm]
[   36.520752]  [<ffffffffa03fe098>] drm_framebuffer_unreference+0x47/0x4b [drm]
[   36.520752]  [<ffffffffa040b86c>] drm_atomic_set_fb_for_plane+0x20/0x7f [drm]
[   36.520752]  [<ffffffffa049bcc6>] drm_plane_helper_update+0x74/0xca [drm_kms_helper]
[   36.520752]  [<ffffffffa074c88c>] __intel_set_mode+0x767/0x86c [i915]
[   36.520752]  [<ffffffffa075167e>] intel_set_mode+0x6d/0x8e [i915]
[   36.520752]  [<ffffffffa0751b2f>] intel_get_load_detect_pipe+0x3cc/0x46f [i915]
[   36.520752]  [<ffffffffa077d4c4>] intel_tv_detect+0x117/0x459 [i915]
[   36.520752]  [<ffffffff8107eea6>] ? vprintk_default+0x1d/0x1f
[   36.520752]  [<ffffffff8107ee47>] ? vprintk_emit+0x3f6/0x438
[   36.520752]  [<ffffffff8107ee57>] ? vprintk_emit+0x406/0x438
[   36.520752]  [<ffffffffa049b0d3>] drm_helper_probe_single_connector_modes_merge_bits+0xcd/0x3a1 [drm_kms_helper]
[   36.520752]  [<ffffffffa049b3cc>] drm_helper_probe_single_connector_modes+0x13/0x15 [drm_kms_helper]
[   36.520752]  [<ffffffffa04a2494>] drm_fb_helper_probe_connector_modes+0x43/0x5b [drm_kms_helper]
[   36.520752]  [<ffffffffa04a3fa4>] drm_fb_helper_hotplug_event+0x7a/0xb2 [drm_kms_helper]
[   36.520752]  [<ffffffffa075f9a4>] intel_fbdev_output_poll_changed+0x1e/0x20 [i915]
[   36.520752]  [<ffffffffa049adcd>] drm_kms_helper_hotplug_event+0x28/0x2c [drm_kms_helper]
[   36.520752]  [<ffffffffa049aefe>] output_poll_execute+0x12d/0x14e [drm_kms_helper]
[   36.520752]  [<ffffffff81057b52>] process_one_work+0x16e/0x294
[   36.520752]  [<ffffffff81057e58>] worker_thread+0x1b1/0x288
[   36.520752]  [<ffffffff81057ca7>] ? process_scheduled_works+0x2f/0x2f
[   36.520752]  [<ffffffff8105bb92>] kthread+0xa5/0xad
[   36.520752]  [<ffffffff8105baed>] ? __kthread_parkme+0x61/0x61
[   36.520752]  [<ffffffff8145fd6c>] ret_from_fork+0x7c/0xb0
[   36.520752]  [<ffffffff8105baed>] ? __kthread_parkme+0x61/0x61
[   36.520752] Code: 05 bc c8 ba 7e 85 c0 75 05 e8 d2 b6 db ff 48 83 c4 28 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 66 66 66 90 55 48 89 e5 53 48 89 fb 52 <f0> ff 0f 79 05 e8 b1 fe ff ff 65 48 8b 04 25 00 aa 00 00 48 89 
[   36.520752] RIP  [<ffffffff8145e13a>] mutex_lock+0xe/0x29
[   36.520752]  RSP <ffff880197d1f838>
[   36.520752] CR2: 00000000000002ec
[   36.520752] ---[ end trace df8a9d2a655f33b0 ]---


According to the backtrace, this looks like a drm regression. The full
kernel log with drm.debug=6 is attached. It was taken on v4.0-rc1 with
the patch mentioned above applied on top. The same crash also happens
on a clean v4.0-rc1.
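
As a reading aid for the oops above, here is a minimal Python sketch
that lists the general-purpose registers holding the faulting address
(CR2). The register lines are copied from the log; the helper names
(parse_registers, registers_matching_fault) are made up for this
illustration and are not part of any kernel tool. On x86-64 RDI carries
the first function argument, so a match there suggests mutex_lock() was
handed the pointer 0x2ec, i.e. probably a lock embedded at that offset
in a structure whose pointer was NULL. Which structure that is remains
an assumption and is not determined by this script.

```python
import re

# Register lines copied verbatim from the oops in this mail.
OOPS = """\
[   36.519781] BUG: unable to handle kernel NULL pointer dereference at 00000000000002ec
[   36.520752] RAX: ffff8801974c80c0 RBX: 00000000000002ec RCX: 0000000080000000
[   36.520752] RDX: ffff88019fd00000 RSI: ffffffffa03fda6e RDI: 00000000000002ec
[   36.520752] RBP: ffff880197d1f848 R08: 0000000000000001 R09: ffffffff81ea9154
[   36.520752] R10: ffffffff81ea9154 R11: ffff88019fd0d300 R12: 00000000000002ec
[   36.520752] CR2: 00000000000002ec CR3: 00000000d71b9000 CR4: 00000000000006e0
"""

# Matches "RAX: <16 hex digits>", "R08: ...", "CR2: ..." and similar pairs.
REG_RE = re.compile(r"\b(R[A-Z0-9]{2}|CR\d): ([0-9a-f]{16})\b")


def parse_registers(text):
    """Return a dict mapping register name to its integer value."""
    return {name: int(value, 16) for name, value in REG_RE.findall(text)}


def registers_matching_fault(text):
    """Return (fault address, sorted names of registers equal to CR2)."""
    regs = parse_registers(text)
    fault = regs["CR2"]
    matches = sorted(n for n, v in regs.items() if v == fault and n != "CR2")
    return fault, matches


if __name__ == "__main__":
    fault, matches = registers_matching_fault(OOPS)
    print("fault address: %#x" % fault)
    print("registers holding it:", ", ".join(matches))
```

For the excerpt above this reports RDI, RBX and R12, all equal to
0x2ec.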



-- 
Best regards,
Andrey Skvortsov

Secure e-mail with gnupg: See http://www.gnupg.org/
PGP Key ID: 0x57A3AEAD



View attachment "dmesg-4.0-rc1-crash.log" of type "text/plain" (200323 bytes)

Download attachment "signature.asc" of type "application/pgp-signature" (820 bytes)
