linux-kernel - Re: more intel drm issues (was Re: [git pull] drm intel only fixes)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <AANLkTi=AEhztUp30A0o5YxHCqqLg38+f5s+mXfbZTX8Y@mail.gmail.com>
Date:	Thu, 20 Jan 2011 08:07:02 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Chris Wilson <chris@...is-wilson.co.uk>
Cc:	Jeff Chua <jeff.chua.linux@...il.com>,
	Len Brown <len.brown@...el.com>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Jesse Barnes <jbarnes@...tuousgeek.org>,
	Dave Airlie <airlied@...ux.ie>, linux-kernel@...r.kernel.org,
	DRI mailing list <dri-devel@...ts.freedesktop.org>
Subject: Re: more intel drm issues (was Re: [git pull] drm intel only fixes)

On Thu, Jan 20, 2011 at 2:25 AM, Chris Wilson <chris@...is-wilson.co.uk> wrote:
>
> Right, the autoreported HEAD may have been already reset to 0 and so hit
> the wraparound bug which caused it to exit early without actually
> quiescing the ringbuffer.

Yeah, that would explain the issue.

> Another possibility is that I added a 3s timeout waiting for a request if
> IRQs were suspended:

No, if IRQ's are actually suspended here, then that codepath is
totally buggy and would blow up (msleep() doesn't work, and jiffies
wouldn't advance on UP). So that's not it.

> Both of those I think are symptoms of another problem, that perhaps during
> suspend we are shutting down parts of the chip before idling?

That could be, but looking at the code, one thing strikes me: the
_normal_ case (of just waiting for "enough space" in the ring buffer)
doesn't need to use the exact case, but the "wait for ring buffer to
be totally empty" does.

Which means that the use of the "fast-but-inaccurate" 'head' sounds
wrong for the "wait for idle" case.

So can you explain the difference between

   intel_read_status_page(ring, 4);

vs

   I915_READ_HEAD(ring);

because from looking at the code, I get the notion that
"intel_read_status_page()" may not be exact. But what happens if that
inexact value matches our cached ring->actual_head, so we never even
try to read the exact case? Does it _stay_ inexact for arbitrarily
long times? If so, we might wait for the ring to empty forever (well,
until the timeout - the behavior I see), even though the ring really
_is_ empty. No?

Also, isn't that "head < ring->actual_head" buggy? What about the
overflow case? Not that we care, because afaik, 'actual_head' is not
actually used anywhere, so it should be called 'pointless_head'?

That code looks suspiciously bogus.

                    Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/