Message-ID: <20150228122037.GA12590@nest>
Date: Sat, 28 Feb 2015 15:20:37 +0300
From: Andrey Skvortsov <andrej.skvortzov@...il.com>
To: Chris Wilson <chris@...is-wilson.co.uk>,
Daniel Vetter <daniel.vetter@...el.com>,
Jani Nikula <jani.nikula@...ux.intel.com>,
David Airlie <airlied@...ux.ie>,
intel-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, Sitsofe Wheeler <sitsofe@...il.com>
Subject: Re: [Intel-gfx] [Regression] WARNING:
drivers/gpu/drm/i915/i915_gem.c:4525 i915_gem_free_object
On 24 Feb, Daniel Vetter wrote:
> On Mon, Feb 23, 2015 at 09:20:31PM +0000, Chris Wilson wrote:
> > On Mon, Feb 23, 2015 at 11:12:39PM +0300, Andrey Skvortsov wrote:
> > > Hi,
> > >
> > > This warning has now moved from linux-next to v4.0-rc1. After boot, the system shows only a black screen.
> > > I ssh'ed into the machine and saved the log. I attached an updated dmesg.log taken with drm.debug=6. Hopefully it helps.
> > > If you need any other debug information, traces, a core dump or anything else, feel free to ask.
> >
> > The warning from free_object is annoying (and quite possibly dangerous),
> > but the actual hang during boot is:
> >
> > [ 243.876375] INFO: task Xorg:2422 blocked for more than 120 seconds.
> > [ 243.876382] Tainted: G W E 4.0.0-rc1-150223- #2
> > [ 243.876388] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 243.876393] Xorg D ffff88019fc12dc0 0 2422 2180 0x00400000
> > [ 243.876404] ffff8800dabfe1a0 0000000000000002 ffff880194537fd8 ffff880194537ba0
> > [ 243.876416] ffff8800dab9e22c ffff8800dabfe1a0 ffff8800dab9e230 00000000ffffffff
> > [ 243.876426] ffffffff813e2479 ffff8800dab9e228 ffffffff813e26a7 0000000000000000
> > [ 243.876438] Call Trace:
> > [ 243.876449] [<ffffffff813e2479>] ? schedule+0x6f/0x7c
> > [ 243.876459] [<ffffffff813e26a7>] ? schedule_preempt_disabled+0x15/0x21
> > [ 243.876469] [<ffffffff813e3347>] ? __ww_mutex_lock_slowpath+0xdf/0x1c2
> > [ 243.876480] [<ffffffff813e3446>] ? __ww_mutex_lock+0x1c/0x93
> > [ 243.876541] [<ffffffffa050e70d>] ? modeset_lock+0x8f/0xf2 [drm]
> > [ 243.876632] [<ffffffffa09aa0b9>] ? intel_get_load_detect_pipe+0x80/0x427 [i915]
> > [ 243.876674] [<ffffffffa04fd42f>] ? drm_ut_debug_printk+0x5e/0x63 [drm]
> > [ 243.876771] [<ffffffffa09d4661>] ? intel_tv_detect+0x115/0x43a [i915]
> > [ 243.876783] [<ffffffff810608d9>] ? preempt_count_sub+0xbf/0xca
> > [ 243.876809] [<ffffffffa05d6f24>] ? drm_helper_probe_single_connector_modes_merge_bits+0xc6/0x38d [drm_kms_helper]
> > [ 243.876860] [<ffffffffa0505b5d>] ? drm_mode_getconnector+0xf4/0x2ac [drm]
> > [ 243.876900] [<ffffffffa04fa911>] ? drm_ioctl+0x338/0x3c5 [drm]
> > [ 243.876949] [<ffffffffa0505a69>] ? drm_mode_getcrtc+0xb3/0xb3 [drm]
> > [ 243.876961] [<ffffffff81167deb>] ? fsnotify+0x314/0x35d
> > [ 243.876973] [<ffffffff811487be>] ? do_vfs_ioctl+0x379/0x431
> > [ 243.876983] [<ffffffff811488cc>] ? SyS_ioctl+0x56/0x7c
> > [ 243.876994] [<ffffffff813e5152>] ? system_call_fastpath+0x12/0x17
> >
> > i.e. it is a mutex deadlock inside tv detect. Daniel, does that make sense?
>
> Botched locking rework for atomic. Fix is
>
> https://patchwork.kernel.org/patch/5861631/
>
> and will land as soon as an affected user has provided a tested-by.
> Andrey, can you pls give this a spin?
Hi,
Tested-by: Andrey Skvortsov <andrej.skvortzov@...il.com>
The patch certainly fixes the deadlock, and Xorg is running again.
Unfortunately, this is not the last bug that breaks i915/drm
on my laptop. Sometimes the system boots successfully with a couple of the
warnings mentioned in my previous mail:
[ 26.922953] WARNING: CPU: 1 PID: 767 at drivers/gpu/drm/i915/i915_gem.c:4525 i915_gem_free_object+0x13f/0x288 [i915]()
[ 26.922954] WARN_ON(obj->frontbuffer_bits)
and
[ 36.794045] WARNING: CPU: 0 PID: 18 at include/linux/kref.h:47 drm_framebuffer_reference+0x60/0x6b [drm]()
but quite often the kernel crashes during boot. I captured the kernel log
over netconsole:
[ 36.519781] BUG: unable to handle kernel NULL pointer dereference at 00000000000002ec
[ 36.520752] IP: [<ffffffff8145e13a>] mutex_lock+0xe/0x29
[ 36.520752] PGD 1952fb067 PUD 193c64067 PMD 0
[ 36.520752] Oops: 0002 [#1] PREEMPT SMP
[ 36.520752] Modules linked in: cfg80211(E) bnep(E) cpufreq_stats(E) cpufreq_powersave(E) cpufreq_userspace(E) cpufreq_conservative(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) cdc_ether(E) usbnet(E) coretemp(E) cdc_wdm(E) cdc_acm(E) kvm_intel(E) joydev(E) kvm(E) i8k(E) i915(E) btusb(E) snd_pcsp(E) bluetooth(E) psmouse(E) evdev(E) snd_hda_codec_generic(E) rfkill(E) lpc_ich(E) mfd_core(E) i2c_i801(E) serio_raw(E) snd_hda_intel(E) drm_kms_helper(E) snd_hda_controller(E) snd_hda_codec(E) drm(E) snd_hwdep(E) snd_pcm(E) i2c_algo_bit(E) i2c_core(E) battery(E) button(E) video(E) ac(E) snd_timer(E) snd(E) soundcore(E) acpi_cpufreq(E) processor(E) fuse(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) ext4(E) crc16(E) jbd2(E) mbcache(E) dm_mod(E) sd_mod(E) ata_generic(E) ahci(E) libahci(E) firewire_ohci(E) ata_piix(E) libata(E) sdhci_pci(E) firewire_core(E) scsi_mod(E) sdhci(E) crc_itu_t(E) mmc_core(E) thermal(E) thermal_sys(E)
[ 36.520752] CPU: 1 PID: 19 Comm: kworker/1:1 Tainted: G W E 4.0.0-rc1-150225--00001-gb802a6b #10
[ 36.520752] Hardware name: Dell Inc. Vostro 1500 /0NX907, BIOS A06 04/21/2008
[ 36.520752] Workqueue: events output_poll_execute [drm_kms_helper]
[ 36.520752] task: ffff880197f69aa0 ti: ffff880197d1c000 task.ti: ffff880197d1c000
[ 36.520752] RIP: 0010:[<ffffffff8145e13a>] [<ffffffff8145e13a>] mutex_lock+0xe/0x29
[ 36.520752] RSP: 0018:ffff880197d1f838 EFLAGS: 00010246
[ 36.520752] RAX: ffff8801974c80c0 RBX: 00000000000002ec RCX: 0000000080000000
[ 36.520752] RDX: ffff88019fd00000 RSI: ffffffffa03fda6e RDI: 00000000000002ec
[ 36.520752] RBP: ffff880197d1f848 R08: 0000000000000001 R09: ffffffff81ea9154
[ 36.520752] R10: ffffffff81ea9154 R11: ffff88019fd0d300 R12: 00000000000002ec
[ 36.520752] R13: 0000000000000004 R14: ffff880197744d80 R15: ffff88019547e000
[ 36.520752] FS: 0000000000000000(0000) GS:ffff88019fd00000(0000) knlGS:0000000000000000
[ 36.520752] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 36.520752] CR2: 00000000000002ec CR3: 00000000d71b9000 CR4: 00000000000006e0
[ 36.520752] Stack:
[ 36.520752] ffff88019fd00000 ffff880197744d88 ffff880197d1f878 ffffffffa03fda98
[ 36.520752] ffffffffa03fda6e ffff880197744d88 0000000000000000 ffff880196c2e400
[ 36.520752] ffff880197d1f898 ffffffffa03fde15 ffff880197744d80 ffff8801974c80c0
[ 36.520752] Call Trace:
[ 36.520752] [<ffffffffa03fda98>] drm_framebuffer_free+0x2a/0x56 [drm]
[ 36.520752] [<ffffffffa03fda6e>] ? drm_framebuffer_unregister_private+0x43/0x43 [drm]
[ 36.520752] [<ffffffffa03fde15>] kref_sub.constprop.33+0x34/0x3e [drm]
[ 36.520752] [<ffffffffa03fe098>] drm_framebuffer_unreference+0x47/0x4b [drm]
[ 36.520752] [<ffffffffa040b86c>] drm_atomic_set_fb_for_plane+0x20/0x7f [drm]
[ 36.520752] [<ffffffffa049bcc6>] drm_plane_helper_update+0x74/0xca [drm_kms_helper]
[ 36.520752] [<ffffffffa074c88c>] __intel_set_mode+0x767/0x86c [i915]
[ 36.520752] [<ffffffffa075167e>] intel_set_mode+0x6d/0x8e [i915]
[ 36.520752] [<ffffffffa0751b2f>] intel_get_load_detect_pipe+0x3cc/0x46f [i915]
[ 36.520752] [<ffffffffa077d4c4>] intel_tv_detect+0x117/0x459 [i915]
[ 36.520752] [<ffffffff8107eea6>] ? vprintk_default+0x1d/0x1f
[ 36.520752] [<ffffffff8107ee47>] ? vprintk_emit+0x3f6/0x438
[ 36.520752] [<ffffffff8107ee57>] ? vprintk_emit+0x406/0x438
[ 36.520752] [<ffffffffa049b0d3>] drm_helper_probe_single_connector_modes_merge_bits+0xcd/0x3a1 [drm_kms_helper]
[ 36.520752] [<ffffffffa049b3cc>] drm_helper_probe_single_connector_modes+0x13/0x15 [drm_kms_helper]
[ 36.520752] [<ffffffffa04a2494>] drm_fb_helper_probe_connector_modes+0x43/0x5b [drm_kms_helper]
[ 36.520752] [<ffffffffa04a3fa4>] drm_fb_helper_hotplug_event+0x7a/0xb2 [drm_kms_helper]
[ 36.520752] [<ffffffffa075f9a4>] intel_fbdev_output_poll_changed+0x1e/0x20 [i915]
[ 36.520752] [<ffffffffa049adcd>] drm_kms_helper_hotplug_event+0x28/0x2c [drm_kms_helper]
[ 36.520752] [<ffffffffa049aefe>] output_poll_execute+0x12d/0x14e [drm_kms_helper]
[ 36.520752] [<ffffffff81057b52>] process_one_work+0x16e/0x294
[ 36.520752] [<ffffffff81057e58>] worker_thread+0x1b1/0x288
[ 36.520752] [<ffffffff81057ca7>] ? process_scheduled_works+0x2f/0x2f
[ 36.520752] [<ffffffff8105bb92>] kthread+0xa5/0xad
[ 36.520752] [<ffffffff8105baed>] ? __kthread_parkme+0x61/0x61
[ 36.520752] [<ffffffff8145fd6c>] ret_from_fork+0x7c/0xb0
[ 36.520752] [<ffffffff8105baed>] ? __kthread_parkme+0x61/0x61
[ 36.520752] Code: 05 bc c8 ba 7e 85 c0 75 05 e8 d2 b6 db ff 48 83 c4 28 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 66 66 66 90 55 48 89 e5 53 48 89 fb 52 <f0> ff 0f 79 05 e8 b1 fe ff ff 65 48 8b 04 25 00 aa 00 00 48 89
[ 36.520752] RIP [<ffffffff8145e13a>] mutex_lock+0xe/0x29
[ 36.520752] RSP <ffff880197d1f838>
[ 36.520752] CR2: 00000000000002ec
[ 36.520752] ---[ end trace df8a9d2a655f33b0 ]---
According to the backtrace, this looks like a drm regression. The full
kernel log with drm.debug=6 is attached. It was taken from v4.0-rc1
with the patch mentioned above applied on top. The same crash also
happens on a clean v4.0-rc1.
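
For anyone triaging similar logs, here is a minimal sketch of how the frames in the oops above can be grouped by module to see which code is on the call path. It is only an illustration: the helper names (parse_frames, reliable_frames_per_module) and the "vmlinux" label for frames without a module tag are my own assumptions, and only a few frames from the trace are copied in.

```python
import re
from collections import Counter

# A few frames copied from the oops above. Frames marked with "?" are
# entries the unwinder is not sure about, so they are counted separately.
TRACE = """\
[ 36.520752] [<ffffffffa03fda98>] drm_framebuffer_free+0x2a/0x56 [drm]
[ 36.520752] [<ffffffffa03fda6e>] ? drm_framebuffer_unregister_private+0x43/0x43 [drm]
[ 36.520752] [<ffffffffa03fe098>] drm_framebuffer_unreference+0x47/0x4b [drm]
[ 36.520752] [<ffffffffa040b86c>] drm_atomic_set_fb_for_plane+0x20/0x7f [drm]
[ 36.520752] [<ffffffffa049bcc6>] drm_plane_helper_update+0x74/0xca [drm_kms_helper]
[ 36.520752] [<ffffffffa074c88c>] __intel_set_mode+0x767/0x86c [i915]
[ 36.520752] [<ffffffff81057b52>] process_one_work+0x16e/0x294
"""

# [<address>] [?] symbol+0xoffset/0xsize [module]
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(text):
    """Return (symbol, module, reliable) tuples for each call-trace line."""
    frames = []
    for line in text.splitlines():
        match = FRAME.search(line)
        if match is None:
            continue
        # Frames without a [module] tag belong to the core kernel image;
        # "vmlinux" is just a label chosen for this sketch.
        module = match.group("module") or "vmlinux"
        reliable = match.group("unreliable") is None
        frames.append((match.group("symbol"), module, reliable))
    return frames


def reliable_frames_per_module(text):
    """Count frames per module, skipping the ones marked with '?'."""
    return Counter(
        module for _, module, reliable in parse_frames(text) if reliable
    )


if __name__ == "__main__":
    for module, count in reliable_frames_per_module(TRACE).most_common():
        print(module, count)
```

Such a count only shows where the call path runs; it does not by itself prove which commit introduced the regression.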
--
Best regards,
Andrey Skvortsov
Secure e-mail with gnupg: See http://www.gnupg.org/
PGP Key ID: 0x57A3AEAD
View attachment "dmesg-4.0-rc1-crash.log" of type "text/plain" (200323 bytes)
Download attachment "signature.asc" of type "application/pgp-signature" (820 bytes)