linux-kernel - Re: [REGRESSION]: hibernate/sleep regression w/ bisection

Message-ID: <20120425200402.GB22273@zeus>
Date:	Wed, 25 Apr 2012 15:04:34 -0500
From:	Andrew Watts <akwatts@...il.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	Dmitry Torokhov <dmitry.torokhov@...il.com>,
	linux-kernel@...r.kernel.org, linux-pm@...ts.linux-foundation.org,
	David Airlie <airlied@...ux.ie>,
	dri-devel@...ts.freedesktop.org
Subject: Re: [REGRESSION]: hibernate/sleep regression w/ bisection

Hello all.

I wanted to follow up on a very peculiar yet highly reproducible bug
involving suspend/resume, radeon, and a seemingly unrelated patch to
input serio.

The last comment in the thread was from Jerome Glisse saying it was a tough
bug to fix. Has anyone had any good ideas on how to fix it since?

Thanks!

~ Andy


On Fri, Nov 04, 2011 at 09:14:31AM -0700, Tejun Heo wrote:
> (cc'ing David Airlie and dri-devel)
> 
> Hello, the original thread can be read from
> 
>   http://thread.gmane.org/gmane.linux.kernel/1209587
> 
> Full sysrq-t output at
> 
>   http://article.gmane.org/gmane.linux.kernel/1211256
> 
> So, the problem is that after a seemingly unrelated update to input
> serio driver (convert to use workqueue), X seems to lock up
> sporadically across suspend/resume cycles.
> 
> I went through the full sysrq-t output but couldn't spot anything
> suspicious w/ anything else.  No worker is stuck and nobody is waiting
> for flush to finish.
> 
> Stack trace for X follows.
> 
> > X               S f499b944  5800  1652   1651 0x00400080
> >  f499b9a8 00003086 00000000 f499b944 c100d4a4 00000000 00000000 f499b958
> >  00000000 f499b9a8 f5173140 d7857c56 00000057 f5173140 d8b69880 00000057
> >  00000001 00000000 f499b9b4 c104dd89 000f4240 00000000 00000000 f499ba68
> > Call Trace:
> >  [<c1291301>] ttm_bo_wait_unreserved+0x5f/0x106
> >  [<c129145f>] ttm_bo_reserve_locked+0xb7/0xe1
> >  [<c1292c27>] ttm_bo_reserve+0x26/0x95
> >  [<c12c3c97>] radeon_crtc_do_set_base+0xbd/0x6d2
> >  [<c12c42e7>] radeon_crtc_set_base+0x1b/0x1d
> >  [<c12c430d>] radeon_crtc_mode_set+0x24/0xdd7
> >  [<c1279c57>] drm_crtc_helper_set_mode+0x32c/0x48b
> >  [<c1279e2f>] drm_helper_resume_force_mode+0x79/0x23e
> >  [<c12ace10>] radeon_gpu_reset+0x84/0x98
> >  [<c12c0838>] radeon_fence_wait+0x2d1/0x311
> >  [<c12c0e37>] radeon_sync_obj_wait+0xc/0xe
> >  [<c12908be>] ttm_bo_wait+0xa1/0x108
> >  [<c12d6e7b>] radeon_gem_wait_idle_ioctl+0x76/0xc4
> >  [<c127e62e>] drm_ioctl+0x1c2/0x42c
> >  [<c10e288e>] do_vfs_ioctl+0x79/0x54b
> >  [<c10e2dcb>] sys_ioctl+0x6b/0x70
> >  [<c1593813>] sysenter_do_call+0x12/0x22
> 
> Do you guys have any ideas what's going on?  It seems to be waiting
> for bo->reserved to go to zero.  Is it possible that someone there is
> forgetting to properly kick a work item after resume causing the wait
> to stall?
> 
> Andrew, can you please kill the X server after the hang and see
> whether that brings the system back?  I think sshd should still work
> and if not you can write a script to kill the X server 30 seconds
> after resume (and kill that script if resume succeeds).
> 
> Thank you.
> 
> -- 
> tejun
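
For anyone trying to reproduce this, the watchdog Tejun describes above
could be sketched roughly as follows. This is only an illustration, not
something posted in the thread: the helper names (find_pids,
arm_watchdog, kill_x_server, main) are made up, the process name "X" is
an assumption (some setups run it as "Xorg"), SIGTERM is an assumed
choice of signal, and the script is assumed to be started by hand or
from a resume hook right around the time the machine resumes.

```python
import os
import signal
import subprocess
import threading

DELAY_SECONDS = 30      # grace period after resume, per the suggestion above
X_PROCESS_NAME = "X"    # assumption: may be "Xorg" on some systems


def find_pids(process_name):
    """Return PIDs whose command name matches exactly (uses pgrep -x)."""
    result = subprocess.run(["pgrep", "-x", process_name],
                            capture_output=True, text=True)
    return [int(token) for token in result.stdout.split()]


def kill_x_server(process_name=X_PROCESS_NAME):
    """Send SIGTERM to every matching process (assumed signal choice)."""
    for pid in find_pids(process_name):
        os.kill(pid, signal.SIGTERM)


def arm_watchdog(delay_seconds, action):
    """Run action() after delay_seconds unless the timer is cancelled."""
    timer = threading.Timer(delay_seconds, action)
    timer.start()
    return timer


def main():
    # Killing this script (e.g. Ctrl-C) while it waits stands in for
    # "kill that script if resume succeeds".
    timer = arm_watchdog(DELAY_SECONDS, kill_x_server)
    try:
        timer.join()
    except KeyboardInterrupt:
        timer.cancel()
```

Calling main() from an ssh session or a text console just before or
right after resume arms the timer. If X comes back normally, interrupt
the script before the 30 seconds are up; otherwise it sends SIGTERM to
the X server, which should show whether killing X brings the system
back.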