Date:   Tue, 5 Nov 2019 08:23:27 -0800
From:   Rob Clark <robdclark@...il.com>
To:     Brian Masney <masneyb@...tation.org>
Cc:     Rob Clark <robdclark@...omium.org>,
        freedreno <freedreno@...ts.freedesktop.org>,
        Sean Paul <sean@...rly.run>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>
Subject: Re: [Freedreno] drm/msm: 'pp done time out' errors after async commit changes

On Tue, Nov 5, 2019 at 2:08 AM Brian Masney <masneyb@...tation.org> wrote:
>
> On Mon, Nov 04, 2019 at 04:19:07PM -0800, Rob Clark wrote:
> > On Mon, Nov 4, 2019 at 4:01 PM Brian Masney <masneyb@...tation.org> wrote:
> > >
> > > Hey Rob,
> > >
> > > Since commit 2d99ced787e3 ("drm/msm: async commit support"), the frame
> > > buffer console on my Nexus 5 began throwing these errors:
> > >
> > > msm fd900000.mdss: pp done time out, lm=0
> > >
> > > The display still works.
> > >
> > > I see that mdp5_flush_commit() was introduced in commit 9f6b65642bd2
> > > ("drm/msm: add kms->flush_commit()") with a TODO comment and the commit
> > > description mentions flushing registers. I assume that this is the
> > > proper fix. If so, can you point me to where these registers are
> > > defined and I can work on the mdp5 implementation.
> >
> > See mdp5_ctl_commit(), which writes the CTL_FLUSH registers.. the idea
> > would be to defer writing CTL_FLUSH[ctl_id] = flush_mask until
> > kms->flush() (which happens from a timer shortly before vblank).
> >
> > But I think the async flush case should not come up with fbcon?  It
> > was really added to cope with hwcursor updates (and userspace that
> > assumes it can do an unlimited # of cursor updates per frame).. the
> > intention was that nothing should change in the sequence for mdp5 (but
> > I guess that was not the case).
>
> The 'pp done time out' errors go away if I revert the following three
> commits:
>
> cd6d923167b1 ("drm/msm/dpu: async commit support")
> d934a712c5e6 ("drm/msm: add atomic traces")
> 2d99ced787e3 ("drm/msm: async commit support")
>
> I reverted the first one to fix a compiler error, and the second one so
> that the last patch can be reverted without any merge conflicts.
>
> I see that crtc_flush() calls mdp5_ctl_commit(). I tried to use
> crtc_flush_all() in mdp5_flush_commit() and the contents of the frame
> buffer dance around the screen like it's out of sync. I renamed
> crtc_flush_all() to mdp5_crtc_flush_all() and removed the static
> declaration. Here's the relevant part of what I tried:
>
> --- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> +++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c
> @@ -171,7 +171,15 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
>
>  static void mdp5_flush_commit(struct msm_kms *kms, unsigned crtc_mask)
>  {
> -       /* TODO */
> +       struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
> +       struct drm_crtc *crtc;
> +
> +       for_each_crtc_mask(mdp5_kms->dev, crtc, crtc_mask) {
> +               if (!crtc->state->active)
> +                       continue;
> +
> +               mdp5_crtc_flush_all(crtc);
> +       }
>  }
>
> Any tips would be appreciated.


I think this is along the lines of what we need to enable async commit
for mdp5 (but also removing the flush from the atomic-commit path)..
the principle behind the async commit is to do all the atomic state
commit normally, but defer writing the flush bits.  This way, if you
get another async update before the next vblank, you just apply it
immediately instead of waiting for vblank.
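
To illustrate the deferral, here is a minimal standalone sketch. It is not
the mdp5 code: struct fake_ctl, fake_commit() and fake_flush() are
hypothetical names, and a plain variable stands in for the CTL_FLUSH
register. The commit path only accumulates flush bits, and the flush path
(which would run shortly before vblank) writes them once:

```c
#include <stdint.h>

/* Hypothetical stand-in for a CTL block; not the real mdp5 structures. */
struct fake_ctl {
	uint32_t pending_flush;  /* bits recorded by commits, not yet written */
	uint32_t hw_flush_reg;   /* models the CTL_FLUSH register */
	unsigned int hw_writes;  /* number of times the register was written */
};

/* Commit path: state is programmed as usual, flush bits are only recorded. */
static void fake_commit(struct fake_ctl *ctl, uint32_t flush_mask)
{
	ctl->pending_flush |= flush_mask;
}

/* Flush path: assumed to be called once, shortly before vblank. */
static void fake_flush(struct fake_ctl *ctl)
{
	if (!ctl->pending_flush)
		return;

	ctl->hw_flush_reg = ctl->pending_flush;
	ctl->hw_writes++;
	ctl->pending_flush = 0;
}
```

With this shape, two commits that land before the same vblank end up as a
single register write carrying the union of both masks.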

But I guess you are on a command mode panel, if I remember?  Which is
a case I didn't have a way to test.  And I'm not entirely sure how
kms_funcs->vsync_time() should be implemented for cmd mode panels.

That all said, I think we should first fix what is broken, before
worrying about extending async commit support to mdp5.. which
shouldn't hit the async==true path, due to not implementing
kms_funcs->vsync_time().

What I think is going on is that, in the cmd mode case,
mdp5_wait_flush() (indirectly) calls mdp5_crtc_wait_for_pp_done(),
which waits for a pp-done irq regardless of whether there is a flush
in progress.  Since there is no flush pending, the irq never comes.
But the expectation is that kms_funcs->wait_flush() returns
immediately if there is nothing to wait for.
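
Roughly what I mean, as a standalone sketch rather than a patch: struct
fake_crtc, its flush_pending flag and the fake_* helpers are made up for
illustration, and the real driver may need to track pending flushes
differently. The point is only that the wait is skipped when no flush was
kicked off:

```c
#include <stdbool.h>

/* Hypothetical per-crtc state; not the real mdp5_crtc structure. */
struct fake_crtc {
	bool flush_pending;      /* set when a flush has been kicked off */
	unsigned int irq_waits;  /* how often we blocked waiting for pp-done */
};

/* Stand-in for blocking on the pp-done irq; here it only counts the wait. */
static void fake_wait_for_pp_done(struct fake_crtc *crtc)
{
	crtc->irq_waits++;
	crtc->flush_pending = false;
}

/* wait_flush(): return immediately if there is nothing to wait for. */
static void fake_wait_flush(struct fake_crtc *crtc)
{
	if (!crtc->flush_pending)
		return;

	fake_wait_for_pp_done(crtc);
}
```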

BR,
-R
