Message-Id: <EB643972-4918-4B89-B325-59D03648F2F9@tuebingen.mpg.de>
Date: Sat, 29 Jan 2011 03:59:38 +0100
From: Mario Kleiner <mario.kleiner@...bingen.mpg.de>
To: Hugh Dickins <hughd@...gle.com>
Cc: Chris Wilson <chris@...is-wilson.co.uk>,
Frederic Weisbecker <fweisbec@...il.com>,
linux-kernel@...r.kernel.org,
Daniel Vetter <daniel.vetter@...ll.ch>,
Arnd Bergmann <arnd@...db.de>, Jiri Olsa <jolsa@...hat.com>,
Chris Clayton <chris2553@...glemail.com>,
Mario Kleiner <mario.kleiner@...bingen.mpg.de>
Subject: Re: [PATCH] drm/i915,agp/intel: Do not clear stolen entries
On Jan 28, 2011, at 11:00 PM, Hugh Dickins wrote:
> Sorry, this is now about vblank or scanout rather than stolen
> entries.
>
> On Mon, 24 Jan 2011, Chris Wilson wrote:
>> On Sun, 23 Jan 2011 23:40:41 -0800 (PST), Hugh Dickins
>> <hughd@...gle.com> wrote:
>>
>>> On this laptop I'm typing from (GM965 with KMS), I've had no trouble
>>> getting X up; but when typing in one of the xterms, typed characters
>>> often stop echoing, until I shift to a different window, whereupon
>>> they appear. This condition cleared (for a while) by switching to
>>> VESA fb console and back; no such problem observed on that console.
>>>
>>> Does that sound familiar? I have no evidence whatever that i915 is
>>> to blame here. Several times I tried bisecting last week, but each
>>> attempt ended up in a nonsensical place, because the effect does not
>>> occur to order. So I'd sometimes mark a bisection point as good when
>>> I guess it must actually have been bad. Perhaps it's a matter of
>>> timing or an uninitialized variable. But while I'm here, worth asking
>>> if that behaviour sounds like anything you might be responsible for?
>>
>> Sounds suspiciously like the batch buffer is not being dispatched and
>> flushed to the scanout. A very similar bug was recently fixed for
>> xf86-video-intel 2.14.0 which was causing deferred output.
>
> I made a more patient bisection during the week, on x86_64 which
> seemed more consistent than i386, and this time it converged sensibly:
> to commit 0af7e4dff50454905092d468e91c1ef92e10e6b4
> drm/i915: Add support for precise vblank timestamping (v2)
>
> Which kindly notes in its commit message:
> This code has been only tested on a HP-Mini Netbook with
> Atom processor and Intel 945GME gpu. The codepath for
> (IS_G4X(dev) || IS_GEN5(dev) || IS_GEN6(dev)) gpu's
> has not been tested so far due to lack of hardware.
> so not surprising that it doesn't work on GM965.
>
> I'm now running with this silly revert:
>
> --- a/drivers/gpu/drm/i915/i915_drv.c 2011-01-18 22:04:29.000000000 -0800
> +++ b/drivers/gpu/drm/i915/i915_drv.c 2011-01-24 19:35:51.000000000 -0800
> @@ -674,8 +674,8 @@ static struct drm_driver driver = {
> .device_is_agp = i915_driver_device_is_agp,
> .enable_vblank = i915_enable_vblank,
> .disable_vblank = i915_disable_vblank,
> - .get_vblank_timestamp = i915_get_vblank_timestamp,
> - .get_scanout_position = i915_get_crtc_scanoutpos,
> + .get_vblank_timestamp = NULL /* i915_get_vblank_timestamp */,
> + .get_scanout_position = NULL /* i915_get_crtc_scanoutpos */,
> .irq_preinstall = i915_driver_irq_preinstall,
> .irq_postinstall = i915_driver_irq_postinstall,
> .irq_uninstall = i915_driver_irq_uninstall,
>
> which makes 2.6.38-rc usable; though I do believe that I've seen
> the same issue (unflushed text) occur a couple of times since, much
> too rare to bisect or get upset by, but indicative of some
> remaining bug.
>
Hi,
I just skimmed through the archives of this thread. Do I understand
correctly that the problem that gets fixed by your revert is that
<snip>
>>> when typing in one of the xterms, typed characters
>>> often stop echoing, until I shift to a different window, whereupon
>>> they appear. This condition cleared (for a while) by switching to
>>> VESA fb console and back; no such problem observed on that console.
>>
</snip>
Is this with desktop composition enabled? Do things like glxgears in
a window work correctly? What about with desktop composition off?
For a softer fix to the problem you can revert your revert and
disable use of those functions by the drm core via:
echo 0 > /sys/module/drm/parameters/timestamp_precision_usec
But can you run it with
echo 7 > /sys/module/drm/parameters/debug
and show me bits of the syslog output when the problem happens?
Especially output from the functions
"drm_calc_vbltimestamp_from_scanoutpos" and "drm_handle_vblank", and
maybe from "vblank_disable_fn", "drm_update_vblank_count", and
"drm_vblank_get".
Those functions (are supposed to) compute exact timestamps of the start
of scanout after each vblank. If they get disabled via the "echo
0 ..." then do_gettimeofday() is called instead for a crude approximation
of the start of scanout. The computed timestamps are returned to clients
which want them (oml_sync_control extension). I doubt that many apps
use that extension or its timestamps yet, especially not desktop
compositors etc., so I wouldn't expect trouble from wrong
timestamps there.
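As a rough illustration of the idea (not the kernel code), a timestamp
for the start of scanout can be estimated from the time at which the
scanout position was sampled and the duration of one scanline. The
struct, field and function names below are made up for this sketch, and
the sign convention for vpos (negative while still inside vblank) is an
assumption of the sketch as well:

```c
#include <stdint.h>

/* Hypothetical description of a video mode; field names are illustrative. */
struct mode_timing {
    int64_t pixel_clock_hz; /* pixels per second */
    int32_t htotal;         /* pixels per scanline, including blanking */
};

/* Duration of one scanline in nanoseconds (integer division, rounds down). */
static int64_t line_duration_ns(const struct mode_timing *m)
{
    return (int64_t)m->htotal * 1000000000LL / m->pixel_clock_hz;
}

/*
 * Estimate when scanout of the current frame started (or will start).
 * now_ns: time at which the scanout position was sampled.
 * vpos:   current scanline; in this sketch a negative value means
 *         "still in vblank, this many lines before scanout starts".
 */
static int64_t scanout_start_ns(const struct mode_timing *m,
                                int64_t now_ns, int32_t vpos)
{
    return now_ns - (int64_t)vpos * line_duration_ns(m);
}
```

With a 100 MHz pixel clock and 1000 pixels per line, one line lasts
10000 ns, so a sample taken at scanline 10 puts the start of scanout
100000 ns before the sample time.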
However, the timestamps are also used in drm_handle_vblank() in
drivers/gpu/drm/drm_irq.c at each vblank irq to detect and filter out
redundant vblank irqs to avoid miscounting of vblanks (observed on
some Radeons). If the kms driver delivered a grossly wrong
timestamp and something were wrong in the implementation of that
filtering, it could happen that the vblank counter doesn't get
incremented -> delivery of a vblank event to the x-server gets
delayed -> a swapbuffer operation on a composited desktop gets
delayed -> content of a redirected window updates only with a delay.
The relevant check which could prevent vblank counter increments and
delay vblank event delivery to the x-server in drm_handle_vblank()
would be:
if (abs(diff_ns) > DRM_REDUNDANT_VBLIRQ_THRESH_NS) {
The condition should be satisfied if everything works correctly, but
also if timestamps were grossly wrong, since that would lead to a
positive or negative diff_ns larger than 1 msec. diff_ns is an s64,
i.e. a signed 64 bit integer. Could abs(diff_ns) somehow miscompute for
large 64 bit numbers?
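To make that question concrete, here is a minimal user-space sketch
(not kernel code) of how such a check would behave if the absolute
value were computed on only the low 32 bits of diff_ns. The helper
abs_narrow32() is purely hypothetical and models the suspected failure;
whether the kernel's abs() behaves like this is exactly the open
question. THRESH_NS is a local stand-in set to the 1 msec mentioned
above:

```c
#include <stdint.h>

/* 1 msec, as mentioned in the text; a local stand-in for
 * DRM_REDUNDANT_VBLIRQ_THRESH_NS. */
#define THRESH_NS 1000000LL

/* Full-width absolute value of a signed 64 bit number
 * (INT64_MIN is not handled in this sketch). */
static int64_t abs_s64(int64_t x)
{
    return x < 0 ? -x : x;
}

/* Hypothetical abs helper that only sees the low 32 bits of its
 * argument, as could happen if the value passed through a 32 bit
 * temporary. */
static int64_t abs_narrow32(int64_t x)
{
    uint32_t low = (uint32_t)x; /* well-defined: reduces modulo 2^32 */
    /* Reinterpret the low 32 bits as a signed two's complement value. */
    int64_t t = (low & 0x80000000u) ? (int64_t)low - 4294967296LL
                                    : (int64_t)low;
    return t < 0 ? -t : t;
}

/* The check, parameterised on the abs helper: returns 1 if the irq
 * would be counted as a new vblank, 0 if it would be filtered out. */
static int counts_as_new_vblank(int64_t diff_ns, int64_t (*abs_fn)(int64_t))
{
    return abs_fn(diff_ns) > THRESH_NS;
}
```

With a normal 60 Hz frame interval (diff_ns around 16666667) both
variants count the vblank. With a grossly wrong diff_ns such as
2^32 + 500 ns, the full-width variant still counts it, while the
narrowing variant sees only 500 ns, treats the irq as redundant and
leaves the counter untouched, which would match the delayed updates
described above.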
All guesswork; the syslog output should tell us whether the
timestamping is really involved in the problem.
thanks,
-mario
*********************************************************************
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany
e-mail: mario.kleiner@...bingen.mpg.de
office: +49 (0)7071/601-1623
fax: +49 (0)7071/601-616
www: http://www.kyb.tuebingen.mpg.de/~kleinerm
*********************************************************************
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)