Date:   Tue, 16 Jun 2020 21:09:44 +0200
From:   Stefan Wahren <stefan.wahren@...e.com>
To:     Maxime Ripard <maxime@...no.tech>
Cc:     Eric Anholt <eric@...olt.net>,
        Tim Gover <tim.gover@...pberrypi.com>,
        Dave Stevenson <dave.stevenson@...pberrypi.com>,
        linux-kernel@...r.kernel.org,
        DRI Development <dri-devel@...ts.freedesktop.org>,
        bcm-kernel-feedback-list@...adcom.com,
        Nicolas Saenz Julienne <nsaenzjulienne@...e.de>,
        Phil Elwell <phil@...pberrypi.com>,
        linux-arm-kernel@...ts.infradead.org,
        linux-rpi-kernel@...ts.infradead.org
Subject: Re: [PATCH v3 070/105] drm/vc4: hdmi: rework connectors and encoders

Hi Maxime,

On 16.06.20 at 14:30, Maxime Ripard wrote:
> On Sun, Jun 14, 2020 at 06:16:56PM +0200, Stefan Wahren wrote:
>> On 11.06.20 at 15:34, Maxime Ripard wrote:
>>> Hi Stefan,
>>>
>>> On Sat, Jun 06, 2020 at 10:06:12AM +0200, Stefan Wahren wrote:
>>>> Hi Maxime,
>>>>
>>>> On 05.06.20 at 16:35, Maxime Ripard wrote:
>>>>> Hi Stefan,
>>>>>
>>>>> On Wed, Jun 03, 2020 at 07:32:30PM +0200, Stefan Wahren wrote:
>>>>>> On 02.06.20 at 17:54, Maxime Ripard wrote:
>>>>>> FWIW this is the first patch which breaks X on my Raspberry Pi 3 B.
>>>>>>
>>>>>> Here are the bisect results:
>>>>>>
>>>>>> 587d6e4a529a8d807a5c0bae583dd432d77064d6 bad (black screen, no heartbeat)
>>>>>>
>>>>>> b0523c7b1c9d0edcd6c0fe6d2cb558a9ad5c60a8 good
>>>>>>
>>>>>> 2c6a651cac6359cb0244a40d3b7a14e72918f169 good
>>>>>>
>>>>>> 1705c3cb40906863ec0d24ee5ea5092f5ee2e994 bad (black screen, but heartbeat)
>>>>>>
>>>>>> 601527fea6bb226abd088a864e74b25368218e87 good
>>>>>>
>>>>>> 2165607ede34d229d0cbce916c70c7fb6c0337be good
>>>>>>
>>>>>> f094f388fc2df848227e2ae648df2c97872df42b good
>>>>>>
>>>>>> 020de18840a1075b2671736c6cc2e451030fad74 bad (black screen, but heartbeat)
>>>>>>
>>>>>> 4c4da3823e4d1a8189e96a59a79451fff372f70b good
>>>>>>
>>>>>> 020de18840a1075b2671736c6cc2e451030fad74 is the first bad commit
>>>>>> commit 020de18840a1075b2671736c6cc2e451030fad74
>>>>>> Author: Maxime Ripard <maxime@...no.tech>
>>>>>> Date:   Mon Jan 6 17:17:29 2020 +0100
>>>>>>
>>>>>>     drm/vc4: hdmi: rework connectors and encoders
>>>>>>    
>>>>>>     The vc4_hdmi driver has some custom structures to hold the data it
>>>>>>     needs to associate with the drm_encoder and drm_connector structures.
>>>>>>    
>>>>>>     However, it allocates them separately from the vc4_hdmi structure which
>>>>>>     makes it more complicated than it needs to be.
>>>>>>    
>>>>>>     Move those structures to be contained by vc4_hdmi and update the code
>>>>>>     accordingly.
>>>>>>    
>>>>>>     Signed-off-by: Maxime Ripard <maxime@...no.tech>
>>>>> So it looks like there were two issues on the Pi3. The first one was
>>>>> causing the timeouts (and therefore likely the black screen but
>>>>> heartbeat case you had) and I've fixed it.
>>>>>
>>>>> However, I can indeed reproduce the case with the black screen / no
>>>>> heartbeat you mentioned. My bisection however returns that it's the
>>>>> patch "drm/vc4: hdmi: Implement finer-grained hooks" that is at fault.
>>>>> I've pushed my updated branch, if you have some spare time, it would be
>>>>> great if you could confirm it on your Pi.
>>>> Yesterday I checked out your latest rpi4-kms branch, but I was still
>>>> facing similar issues with my Raspberry Pi 3 and multi_v7_defconfig
>>>> (heartbeat stops, splashscreen freezes, heartbeat is abnormally fast).
>>>> So I tried to bisect, but the offending commit didn't cause an issue
>>>> the second time.
>>>>
>>>> By accident I noticed that a simple reboot seems to hang for at least 8
>>>> minutes (using b0523c7b1c9d0edcd, the base of your branch). This usually
>>>> takes a few seconds. So I consider this base on linux-next too
>>>> unstable for reliable testing.
>>>>
>>>> Is it possible to rebase this on something more stable like linux-5.7 or
>>>> at least drm-misc-next? This should avoid chasing unrelated issues.
>>> I've rebased it on 5.7 here:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux.git/log/?h=rpi4-kms-5.7
>>>
>>> And it looks to be indeed an issue coming from next. That branch can
>>> start the desktop just fine on an RPi3 here. It would be great if you
>>> could confirm on your end.
>>>
>>> Thanks!
>>> Maxime
>> Thank you very much. The good news is that the "black screen, but
>> heartbeat" issue and the reboot hang are gone. Unfortunately the "no
>> heartbeat" issue is still there.
>>
>> Here are more details about the issue. It doesn't occur every time. I
>> would guess the probability is about 40 percent, which made bisecting
>> much harder.
> Are you sure about that 40% reliability?
It's more a gut feeling than a statistical analysis. It's definitely not
100% in my setup.
>  I found out that the culprit
> was that the commit we mentioned was actually running atomic_disable
> before our own custom callbacks, meaning that we would run the custom
> callbacks with the clocks and the power domain shut down, resulting in a
> stall.
>
> I was seeing it all the time when X was shutting down the display, but
> maybe you were changing the resolution between the framebuffer console
> or something, and since the power domain is shut down asynchronously, it
> wasn't running fast enough for the next enable to come up and re-enable
> it again?
>
>> It is reproducible on my 2 Raspberry Pi 3 B Rev 1.2. It
>> also seems independent of the display because the problem occurred on
>> my computer display and my TV.
> But only on HDMI, right?
I only tested it with HDMI displays. All tests without any display were
always successful.
>
> I've pushed a new branch with that fix.

I tested 8 times in a row without any issue. You got it.
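
As a rough sanity check on those 8 clean runs: assuming the roughly 40
percent failure rate guessed earlier in this thread and assuming each boot
is independent (both are assumptions, not measurements), the chance of
seeing such a streak by luck while the bug is still present could be
estimated as sketched below. The function names are made up for
illustration.

```python
import math


def chance_of_clean_streak(failure_rate: float, runs: int) -> float:
    """Probability that `runs` independent boots all succeed although
    each boot would fail with probability `failure_rate`."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must not be negative")
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, max_false_good: float) -> int:
    """Smallest number of consecutive clean runs for which a lucky
    streak is at most `max_false_good` likely."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < max_false_good < 1.0:
        raise ValueError("max_false_good must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_good) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # Assumed inputs: 40% failure rate (a gut feeling), 8 clean runs.
    print(f"{chance_of_clean_streak(0.4, 8):.4f}")  # prints 0.0168
    print(runs_needed(0.4, 0.05))                   # prints 6
```

Under these assumptions a lucky streak of 8 would happen in fewer than 2
out of 100 attempts. The same estimate suggests how many clean boots to
require before marking a commit "good" when bisecting an intermittent
failure like this one.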

Thanks
Stefan

>
> Maxime
