Message-ID: <20200722120502.GK6419@phenom.ffwll.local>
Date:   Wed, 22 Jul 2020 14:05:02 +0200
From:   daniel@...ll.ch
To:     unlisted-recipients:; (no To-header on input)
Cc:     Rodrigo Siqueira <rodrigosiqueiramelo@...il.com>,
        Haneen Mohammed <hamohammed.sa@...il.com>,
        Daniel Vetter <daniel@...ll.ch>,
        David Airlie <airlied@...ux.ie>,
        Rodrigo Siqueira <Rodrigo.Siqueira@....com>,
        dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        kernel-usp@...glegroups.com, twoerner@...il.com
Subject: Re: [PATCH] drm/vkms: add missing drm_crtc_vblank_put to the get/put
 pair on flush

On Wed, Jul 22, 2020 at 08:04:11AM -0300, Melissa Wen wrote:
> This patch adds a missing drm_crtc_vblank_put op to the pair
> drm_crtc_vblank_get/put (inc/decrement counter to guarantee vblanks).
> 
> It clears the execution of the following kms_cursor_crc subtests:
> 1. pipe-A-cursor-[size, alpha-opaque, NxN-(on-screen, off-screen, sliding,
>    random, fast-moving)] - successful when running individually.
> 2. pipe-A-cursor-dpms passes again
> 3. pipe-A-cursor-suspend also passes
> 
> The issue was initially tracked in the sequential execution of the IGT
> kms_cursor_crc subtests: when running the test sequence or one of its
> subtests twice, the odd runs complete and the even ones get stuck in an
> endless wait. In the IGT code, calling a wait_for_vblank before the start
> of CRC capture prevented the busy-wait. But the problem persisted in the
> pipe-A-cursor-dpms and -suspend subtests.
> 
> Checking the history, the pipe-A-cursor-dpms subtest was successful when,
> in vkms_atomic_commit_tail, instead of using the flip_done op, it used
> wait_for_vblanks. Another way to prevent blocking was wait_one_vblank when
> enabling crtc. However, in both cases, pipe-A-cursor-suspend persisted
> blocking in the 2nd start of CRC capture, which may indicate that
> something got stuck in the step of CRC setup. Indeed, wait_one_vblank in
> the crc setup was able to sync things and free all kms_cursor_crc
> subtests.
> 
> Tracing and comparing a clean run with a blocked one:
> - in a clean one, vkms_crtc_atomic_flush enables vblanks;
> - when blocked, only in next op, vkms_crtc_atomic_enable, the vblanks
> started. Moreover, a series of vkms_vblank_simulate flow out until
> disabling vblanks.
> Also watching the steps of vkms_crtc_atomic_flush, when the very first
> drm_crtc_vblank_get returned an error, the subtest crashed. On the other
> hand, when vblank_get succeeded, the subtest completed. Finally, checking
> the flush steps: it increases the counter to hold a vblank reference (get),
> but there isn't an op to decrease it and release vblanks (put).
> 
> Cc: Daniel Vetter <daniel@...ll.ch>
> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@...il.com>
> Cc: Haneen Mohammed <hamohammed.sa@...il.com>
> Signed-off-by: Melissa Wen <melissa.srw@...il.com>
> ---
>  drivers/gpu/drm/vkms/vkms_crtc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index ac85e17428f8..a99d6b4a92dd 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -246,6 +246,7 @@ static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
>  
>  		spin_unlock(&crtc->dev->event_lock);
>  
> +		drm_crtc_vblank_put(crtc);

Uh so I reviewed this a bit more carefully now, and I don't think this is
the correct bugfix. From the kerneldoc of drm_crtc_arm_vblank_event():

 * Caller must hold a vblank reference for the event @e acquired by a
 * drm_crtc_vblank_get(), which will be dropped when the next vblank arrives.

So when we call drm_crtc_arm_vblank_event() the vblank_put gets called
for us once the next vblank arrives. And that's the only case where we
successfully acquired a vblank interrupt reference: on failure of
drm_crtc_vblank_get() (it returns 0 on success and a negative error
number on failure) we send out the event directly.
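
To make the reference flow concrete, here is a minimal userspace sketch of
that get/arm/put pattern. It is not kernel code: struct mock_crtc and all
mock_* functions are hypothetical stand-ins invented for illustration, and
the extra_put flag models the put that the patch adds. It assumes a put on
a zero refcount is merely counted, where the kernel would WARN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a CRTC's vblank state, not the kernel struct. */
struct mock_crtc {
	bool vblank_enabled;	/* can a vblank reference be acquired? */
	int refcount;		/* outstanding vblank references */
	int underflows;		/* puts seen without a matching get */
	bool event_armed;	/* an event is waiting for the next vblank */
	int events_sent;	/* events delivered so far */
};

/* Mirrors the drm_crtc_vblank_get() convention: 0 on success, -errno on failure. */
static int mock_vblank_get(struct mock_crtc *c)
{
	if (!c->vblank_enabled)
		return -EINVAL;
	c->refcount++;
	return 0;
}

static void mock_vblank_put(struct mock_crtc *c)
{
	if (c->refcount == 0) {
		c->underflows++;	/* assumption: the kernel would WARN here */
		return;
	}
	c->refcount--;
}

static void mock_send_event(struct mock_crtc *c)
{
	c->events_sent++;
}

/* Arming hands the caller's reference over to the pending event. */
static void mock_arm_event(struct mock_crtc *c)
{
	assert(c->refcount > 0);	/* caller must hold a reference */
	c->event_armed = true;
}

/* Next vblank: deliver the armed event and drop the reference it held. */
static void mock_handle_vblank(struct mock_crtc *c)
{
	if (!c->event_armed)
		return;
	c->event_armed = false;
	mock_send_event(c);
	mock_vblank_put(c);
}

/* Shape of the flush path under discussion; extra_put models the patch. */
static void mock_atomic_flush(struct mock_crtc *c, bool extra_put)
{
	if (mock_vblank_get(c) != 0)
		mock_send_event(c);	/* no reference held, send right away */
	else
		mock_arm_event(c);	/* reference is dropped at next vblank */

	if (extra_put)
		mock_vblank_put(c);
}
```

In this simplified model, a flush followed by one mock_handle_vblank()
leaves refcount at 0 with no underflow, while the same sequence with
extra_put set records one underflow. The real vblank code has more moving
parts (locking, the vblank disable timer), so this only illustrates the
accounting and does not reproduce the actual behaviour.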

So something else fishy is going on, and now I'm totally confused why this
even happens.

We also have a pile of WARN_ON checks in drm_crtc_vblank_put to make sure
we don't underflow the refcount, so I don't think it's that either (unless
this patch creates more WARNING backtraces).

But clearly it changes behaviour somehow ... can you try to figure out
what changes? Maybe print out the vblank->refcount at various points in
the driver, and maybe also trace when exactly the fake vkms vblank hrtimer
is enabled/disabled ...

I'm totally confused about what's going on here now.
-Daniel

>  		crtc->state->event = NULL;
>  	}
>  
> -- 
> 2.27.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
