Message-Id: <849307$b7q280@azsmga001.ch.intel.com>
Date: Thu, 20 Jan 2011 10:25:40 +0000
From: Chris Wilson <chris@...is-wilson.co.uk>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
Jeff Chua <jeff.chua.linux@...il.com>
Cc: Len Brown <len.brown@...el.com>, "Rafael J. Wysocki" <rjw@...k.pl>,
Jesse Barnes <jbarnes@...tuousgeek.org>,
Dave Airlie <airlied@...ux.ie>, linux-kernel@...r.kernel.org,
DRI mailing list <dri-devel@...ts.freedesktop.org>
Subject: Re: more intel drm issues (was Re: [git pull] drm intel only fixes)
On Wed, 19 Jan 2011 22:22:48 -0800, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Wed, Jan 19, 2011 at 8:55 PM, Jeff Chua <jeff.chua.linux@...il.com> wrote:
> >
> > Rafael sent out two patches earlier. Could be related. I was facing
> > an issue during resume.
>
> No, I'm aware of the rcu-synchronize thing, this isn't it. This is
> really at the suspend stage, and I had bisected it down to the drm
> changes.
>
> In fact, by now I have bisected it down to a single commit. It's
> another merge commit, which makes me a bit nervous (I bisected another
> issue today, and it turned out to simply not be repeatable).
>
> But this time the merge commit actually has a real conflict that got
> fixed up in the merge, and the code around the conflict waits for
> three seconds, and three seconds is also exactly how long the delay at
> suspend time is. So I get the feeling that this time it's a real
> issue, and what happened was that the merge may have been a mismerge.
>
> Chris: as of commit 8d5203ca6253 ("Merge branch 'drm-intel-fixes' into
> drm-intel-next") I'm getting that 3-second delay at suspend time. And
> the merge diff looks like this:
>
> + struct drm_device *dev = ring->dev;
> + struct drm_i915_private *dev_priv = dev->dev_private;
> unsigned long end;
> - drm_i915_private_t *dev_priv = dev->dev_private;
> u32 head;
>
> - head = intel_read_status_page(ring, 4);
> - if (head) {
> - ring->head = head & HEAD_ADDR;
> - ring->space = ring->head - (ring->tail + 8);
> - if (ring->space < 0)
> - ring->space += ring->size;
> - if (ring->space >= n)
> - return 0;
> - }
> -
> trace_i915_ring_wait_begin (dev);
> end = jiffies + 3 * HZ;
> do {
>
> and that whole do-loop with a 3-second timeout makes me *very*
> suspicious. It used to have (in _one_ of the parent branches) that
> code before it to return early if there was space in the ring, now it
> doesn't any more - and that merge coincides with my suspend suddenly
> taking 3 seconds.
>
> The same check that is deleted does exist inside the loop too, but
> there it has some extra code in it (compare to "actual_head" and so
> on), so I wonder if the fast-case was somehow hiding this issue.

Right, the autoreported HEAD may already have been reset to 0 and so hit
the wraparound bug, which caused it to exit early without actually
quiescing the ringbuffer.
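To make the arithmetic concrete, here is a minimal standalone sketch of the free-space calculation from the quoted hunk. It is an illustration only, not the driver code: `ring_space()` and the sample sizes are hypothetical, and only the `head - (tail + 8)` form with the wrap correction is taken from the diff above.

```c
/*
 * Hypothetical standalone helper mirroring the arithmetic in the
 * quoted hunk: free space is the distance from tail to head, minus
 * an 8-byte gap, wrapped into the ring size when negative.
 */
static int ring_space(int head, int tail, int size)
{
	int space = head - (tail + 8);

	if (space < 0)
		space += size;	/* head is numerically behind tail: wrap */
	return space;
}
```

With an assumed 4096-byte ring, a tail at 512 and a true head at 3000, this gives 2480 bytes free. If the reported head reads 0 instead, the same formula gives 3576, so a caller asking for 3000 bytes would be told there is room when there is not. That is the kind of overstatement a reset or stale head value can produce.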

Another possibility is that I added a 3s timeout waiting for a request if
IRQs were suspended:

    commit b5ba177d8d71f011c23b1cabec99fdaddae65c4d
    Author: Chris Wilson <chris@...is-wilson.co.uk>
    Date:   Tue Dec 14 12:17:15 2010 +0000

        drm/i915: Poll for seqno completion if IRQ is disabled
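The general shape of polling with a deadline, as opposed to sleeping until an interrupt arrives, can be sketched in userspace C as follows. This is a rough illustration under stated assumptions and not the code from that commit: `now_ms()`, `poll_until()` and `countdown_done()` are hypothetical names, and `clock_gettime()` stands in for the kernel's jiffies-based timeout.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() */
#endif
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: monotonic clock in milliseconds. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Busy-poll done(arg) until it reports completion or timeout_ms
 * elapses. Returns 0 on completion, -1 on timeout.
 */
static int poll_until(bool (*done)(void *), void *arg, long long timeout_ms)
{
	long long end = now_ms() + timeout_ms;

	do {
		if (done(arg))
			return 0;
	} while (now_ms() < end);

	/* One last check so a completion racing the deadline is not lost. */
	return done(arg) ? 0 : -1;
}

/*
 * Hypothetical condition for demonstration: completes once the
 * counter reaches zero. A negative counter never completes.
 */
static bool countdown_done(void *arg)
{
	int *n = arg;

	if (*n > 0)
		(*n)--;
	return *n == 0;
}
```

If the condition can never become true, for example because the hardware that would signal it has already been powered down, a loop of this form always runs for the full timeout. With a 3-second timeout that would show up as a fixed 3-second stall.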

Both of those I think are symptoms of another problem, that perhaps during
suspend we are shutting down parts of the chip before idling?

-Chris
--
Chris Wilson, Intel Open Source Technology Centre