Message-ID: <20200722140604.27dfzfnzug5vb75r@smtp.gmail.com>
Date: Wed, 22 Jul 2020 11:06:04 -0300
From: Melissa Wen <melissa.srw@...il.com>
To: Daniel Vetter <daniel@...ll.ch>
Cc: Rodrigo Siqueira <rodrigosiqueiramelo@...il.com>,
Haneen Mohammed <hamohammed.sa@...il.com>,
David Airlie <airlied@...ux.ie>,
Rodrigo Siqueira <Rodrigo.Siqueira@....com>,
dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
kernel-usp@...glegroups.com, twoerner@...il.com
Subject: Re: [PATCH] drm/vkms: add missing drm_crtc_vblank_put to the get/put
pair on flush
On 07/22, daniel@...ll.ch wrote:
> On Wed, Jul 22, 2020 at 08:04:11AM -0300, Melissa Wen wrote:
> > This patch adds a missing drm_crtc_vblank_put op to the pair
> > drm_crtc_vblank_get/put (inc/decrement counter to guarantee vblanks).
> >
> > It clears the execution of the following kms_cursor_crc subtests:
> > 1. pipe-A-cursor-[size, alpha-opaque, NxN-(on-screen, off-screen, sliding,
> > random, fast-moving)] - successful when running individually.
> > 2. pipe-A-cursor-dpms passes again
> > 3. pipe-A-cursor-suspend also passes
> >
> > The issue was initially tracked in the sequential execution of IGT
> > kms_cursor_crc subtest: when running the test sequence or one of its
> > subtests twice, the odd runs complete and the even ones get stuck in an
> > endless wait. In the IGT code, calling a wait_for_vblank before the start
> > of CRC capture prevented the busy-wait. But the problem persisted in the
> > pipe-A-cursor-dpms and -suspend subtests.
> >
> > Checking the history, the pipe-A-cursor-dpms subtest was successful when,
> > in vkms_atomic_commit_tail, instead of using the flip_done op, it used
> > wait_for_vblanks. Another way to prevent blocking was wait_one_vblank when
> > enabling crtc. However, in both cases, pipe-A-cursor-suspend persisted
> > blocking in the 2nd start of CRC capture, which may indicate that
> > something got stuck in the step of CRC setup. Indeed, wait_one_vblank in
> > the crc setup was able to sync things and free all kms_cursor_crc
> > subtests.
> >
> > Tracing and comparing a clean run with a blocked one:
> > - in a clean one, vkms_crtc_atomic_flush enables vblanks;
> > - when blocked, the vblanks only start in the next op,
> > vkms_crtc_atomic_enable. Moreover, a series of vkms_vblank_simulate
> > calls follows until vblanks are disabled.
> > Also watching the steps of vkms_crtc_atomic_flush, when the very first
> > drm_crtc_vblank_get returned an error, the subtest crashed. On the other
> > hand, when vblank_get succeeded, the subtest completed. Finally, checking
> > the flush steps: it increases the counter to hold a vblank reference (get),
> > but there isn't an op to decrease it and release vblanks (put).
> >
> > Cc: Daniel Vetter <daniel@...ll.ch>
> > Cc: Rodrigo Siqueira <rodrigosiqueiramelo@...il.com>
> > Cc: Haneen Mohammed <hamohammed.sa@...il.com>
> > Signed-off-by: Melissa Wen <melissa.srw@...il.com>
> > ---
> > drivers/gpu/drm/vkms/vkms_crtc.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > index ac85e17428f8..a99d6b4a92dd 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > @@ -246,6 +246,7 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
> >
> > spin_unlock(&crtc->dev->event_lock);
> >
> > + drm_crtc_vblank_put(crtc);
>
> Uh so I reviewed this a bit more carefully now, and I don't think this is
> the correct bugfix. From the kerneldoc of drm_crtc_arm_vblank_event():
>
> * Caller must hold a vblank reference for the event @e acquired by a
> * drm_crtc_vblank_get(), which will be dropped when the next vblank arrives.
>
> So when we call drm_crtc_arm_vblank_event then the vblank_put gets called
> for us. And that's the only case where we successfully acquired a vblank
> interrupt reference since on failure of drm_crtc_vblank_get (0 indicates
> success for that function, failure negative error number) we directly send
> out the event.
>
> So something else fishy is going on, and now I'm totally confused why this
> even happens.
>
> We also have a pile of WARN_ON checks in drm_crtc_vblank_put to make sure
> we don't underflow the refcount, so it's also not that I think (except if
> this patch creates more WARNING backtraces).
>
> But clearly it changes behaviour somehow ... can you try to figure out
> what changes? Maybe print out the vblank->refcount at various points in
> the driver, and maybe also trace when exactly the fake vkms vblank hrtimer
> is enabled/disabled ...
:(
I can check these, but I also have other suspicions. When I place the
drm_crtc_vblank_put outside the if (at the end of flush), it not only solves
the issue of blocking on kms_cursor_crc, but the WARN_ON on kms_flip also
doesn't appear anymore (a total cleanup). Just after:
vkms_output->composer_state = to_vkms_crtc_state(crtc->state);
it looks like something is stuck around here.
Besides, there is a lock at atomic_begin:
/* This lock is held across the atomic commit to block vblank timer
* from scheduling vkms_composer_worker until the composer is updated
*/
spin_lock_irq(&vkms_output->lock);
that seems to be released on atomic_flush, which makes me suspect that
something is missing in the composer update.
I'll check all these things and come back with news (hope) :)
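To make the reference accounting from the kerneldoc quote above easier to
follow, here is a toy user-space model in C. It is only a sketch, not kernel
code: struct toy_vblank and every toy_* function are hypothetical stand-ins,
and the model assumes that flush is the only holder of vblank references. The
extra_put flag mimics a put added at the end of flush.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-CRTC vblank state (not a kernel struct). */
struct toy_vblank {
	int refcount;      /* outstanding vblank references */
	bool enabled;      /* when false, toy_vblank_get() fails */
	bool event_armed;  /* an event is waiting for the next vblank */
	int events_sent;   /* events delivered so far */
	int underflows;    /* puts attempted at refcount 0 (where a WARN would fire) */
};

/* Returns 0 on success, negative on failure, like the convention in the thread. */
static int toy_vblank_get(struct toy_vblank *v)
{
	if (!v->enabled)
		return -1;
	v->refcount++;
	return 0;
}

static void toy_vblank_put(struct toy_vblank *v)
{
	if (v->refcount == 0) {
		v->underflows++; /* record the underflow instead of going negative */
		return;
	}
	v->refcount--;
}

/* Next vblank: deliver the armed event and drop the reference held for it. */
static void toy_handle_vblank(struct toy_vblank *v)
{
	if (!v->event_armed)
		return;
	v->event_armed = false;
	v->events_sent++;
	toy_vblank_put(v);
}

/* Flush with a pending event: arm it if get succeeds, else send it directly. */
static void toy_flush(struct toy_vblank *v, bool extra_put)
{
	if (toy_vblank_get(v) != 0)
		v->events_sent++;      /* no reference was taken on this path */
	else
		v->event_armed = true; /* reference is dropped at the next vblank */

	if (extra_put)
		toy_vblank_put(v);     /* models the additional put under discussion */
}
```

In this model, toy_flush(&v, false) followed by toy_handle_vblank(&v) leaves
refcount at 0 with no underflow, while extra_put = true makes the put at
vblank time hit a zero count. That matches the argument that the get is
already paired, so if the extra put helps in the real driver, some reference
or timer state not captured here is probably involved.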
Thanks,
Melissa
>
> I'm totally confused about what's going on here now.
> -Daniel
>
> > crtc->state->event = NULL;
> > }
> >
> > --
> > 2.27.0
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch