lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 25 Jan 2016 13:15:45 +0900
From:	Michel Dänzer <michel@...nzer.net>
To:	Ville Syrjälä <ville.syrjala@...ux.intel.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	dri-devel@...ts.freedesktop.org,
	Alex Deucher <alexander.deucher@....com>,
	Christian König <christian.koenig@....com>,
	Vlastimil Babka <vbabka@...e.cz>
Subject: Re: linux-4.4 bisected: kwin5 stuck on kde5 loading screen with
 radeon

On 23.01.2016 00:18, Ville Syrjälä wrote:
> On Fri, Jan 22, 2016 at 12:06:00PM +0900, Michel Dänzer wrote:
>>
>> [ Trimming KDE folks from Cc ]
>>
>> On 21.01.2016 19:09, Daniel Vetter wrote:
>>> On Thu, Jan 21, 2016 at 05:36:46PM +0900, Michel Dänzer wrote:
>>>> On 21.01.2016 16:58, Daniel Vetter wrote:
>>>>>
>>>>> Can you please point me at the vblank on/off jump bug please?
>>>>
>>>> AFAIR I originally reported it in response to
>>>> http://lists.freedesktop.org/archives/dri-devel/2015-August/087841.html
>>>> , but I can't find that in the archives, so maybe that was just on IRC.
>>>> See
>>>> http://lists.freedesktop.org/archives/dri-devel/2016-January/099122.html
>>>> . Basically, I ran into the bug fixed by your patch because the counter
>>>> jumped forward on every DPMS off, so it hit the 32-bit boundary after
>>>> just a few days.
>>>
>>> Ok, so just uncovered the overflow bug.
>>
>> Not sure what you mean by "just", but to be clear: The drm_vblank_on/off
>> counter jumping bug (similar to the bug this thread is about), which
>> exposed the overflow bug, is still alive and kicking in 4.5. It seems
>> to happen when turning off the CRTC:
>>
>> [drm:drm_update_vblank_count] updating vblank count on crtc 0: current=218104694, diff=0, hw=916 hw_last=916
>> [drm:radeon_get_vblank_counter_kms] crtc 0: dist from vblank start 3
>> [drm:drm_calc_vbltimestamp_from_scanoutpos] crtc 0 : v 0x7 p(2199,-45)@ 7304.307354 -> 7304.308006 [e 0 us, 0 rep]
>> [drm:radeon_get_vblank_counter_kms] crtc 0: dist from vblank start 3
>> [drm:drm_update_vblank_count] updating vblank count on crtc 0: current=218104694, diff=16776301, hw=1 hw_last=916
> 
> Not sure what bug we're talking about here, but here the hw counter
> clearly jumps backwards.
> 
>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3
>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3
>> [drm:drm_update_vblank_count] updating vblank count on crtc 1: current=0, diff=0, hw=0 hw_last=0
>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3
>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3
>> [drm:drm_update_vblank_count] updating vblank count on crtc 2: current=0, diff=0, hw=0 hw_last=0
>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3
>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 3
>> [drm:drm_update_vblank_count] updating vblank count on crtc 3: current=0, diff=0, hw=0 hw_last=0
>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 1
>> [drm:drm_calc_vbltimestamp_from_scanoutpos] crtc 0 : v 0x1 p(0,0)@ 7304.317140 -> 7304.317140 [e 0 us, 0 rep]
>> [drm:radeon_get_vblank_counter_kms] Query failed! stat 1
>> [drm:drm_update_vblank_count] updating vblank count on crtc 0: current=234880995, diff=16777215, hw=0 hw_last=1
> 
> Same here.

At least one of the jumps is expected, because this is around turning
off the CRTC for DPMS off. Don't know yet why there are two jumps back
though.


> These things just don't happen on i915 because drm_vblank_off() and
> drm_vblank_on() are always called around the times when the hw counter
> might get reset. Or at least that's how it should be.

Which is of course the idea of Daniel's patch (which is what I'm getting
the above with) or Mario's patch as well, but clearly something's still
wrong. It's certainly possible that it's something in the driver, but
since calling drm_vblank_pre/post_modeset from the same places seems to
work fine (ignoring the regression discussed in this thread)... Do
drm_vblank_on/off require something else to handle this correctly?


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ