linux-kernel - Re: [REGRESSION] TI SN65DSI83 is being reset making display to blink On/Off

Message-ID: <b210003c-4939-4d88-8421-c5e53bd2ad9e@ideasonboard.com>
Date: Wed, 19 Nov 2025 11:39:30 +0200
From: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
To: Maxime Ripard <mripard@...nel.org>,
 Herve Codina <herve.codina@...tlin.com>
Cc: Luca Ceresoli <luca.ceresoli@...tlin.com>,
 Francesco Dolcini <francesco@...cini.it>,
 João Paulo Gonçalves
 <jpaulo.silvagoncalves@...il.com>, Andrzej Hajda <andrzej.hajda@...el.com>,
 Neil Armstrong <neil.armstrong@...aro.org>, Robert Foss <rfoss@...nel.org>,
 Laurent Pinchart <Laurent.pinchart@...asonboard.com>,
 Jonas Karlman <jonas@...boo.se>, Jernej Skrabec <jernej.skrabec@...il.com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Thomas Zimmermann <tzimmermann@...e.de>, David Airlie <airlied@...il.com>,
 Simona Vetter <simona@...ll.ch>,
 João Paulo Gonçalves <joao.goncalves@...adex.com>,
 linux-kernel@...r.kernel.org, regressions@...ts.linux.dev,
 thomas.petazzoni@...tlin.com
Subject: Re: [REGRESSION] TI SN65DSI83 is being reset making display to blink
 On/Off

Hi,

On 19/11/2025 10:40, Maxime Ripard wrote:
> On Wed, Nov 19, 2025 at 08:51:27AM +0100, Herve Codina wrote:
>> Hi Maxime,
>>
>> On Tue, 18 Nov 2025 17:56:36 +0100
>> Maxime Ripard <mripard@...nel.org> wrote:
>>
>>> On Mon, Nov 17, 2025 at 04:27:28PM +0100, Luca Ceresoli wrote:
>>>> On Thu Nov 13, 2025 at 10:19 AM CET, Francesco Dolcini wrote:  
>>>>> On Thu, Nov 13, 2025 at 08:49:50AM +0100, Herve Codina wrote:  
>>>>>> On Mon, 10 Nov 2025 16:03:51 -0300
>>>>>> João Paulo Gonçalves <jpaulo.silvagoncalves@...il.com> wrote:  
>>>>>>> After commit ad5c6ecef27e ("drm: bridge: ti-sn65dsi83: Add error
>>>>>>> recovery mechanism"), our DSI display stopped working correctly. The
>>>>>>> display internally uses a TI SN65DSI83 to convert DSI-to-LVDS, and with
>>>>>>> the change, it keeps blinking on and off because the bridge is being
>>>>>>> reset by the error recovery mechanism.
>>>>>>>
>>>>>>> Even before the change, it was possible to see the message below from
>>>>>>> the driver indicating that the bridge's internal PLL was not locked
>>>>>>> (register 0xE5, bit 0 in [1]):
>>>>>>>
>>>>>>> [ 11.198616] sn65dsi83 2-002c: Unexpected link status 0x01
>>>>>>>
>>>>>>> However, it was working. After the change, it stopped working. Masking
>>>>>>> the PLL error makes it work again:
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
>>>>>>> index 033c44326552..89a0a2ab45b1 100644
>>>>>>> --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
>>>>>>> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
>>>>>>> @@ -429,7 +429,7 @@ static void sn65dsi83_handle_errors(struct sn65dsi83 *ctx)
>>>>>>>          */
>>>>>>>
>>>>>>>         ret = regmap_read(ctx->regmap, REG_IRQ_STAT, &irq_stat);
>>>>>>> -       if (ret || irq_stat) {
>>>>>>> +       if (ret || (irq_stat & ~REG_IRQ_STAT_CHA_PLL_UNLOCK)) {
>>>>>>>                 /*
>>>>>>>                  * IRQ acknowledged is not always possible (the bridge can be in
>>>>>>>                  * a state where it doesn't answer anymore). To prevent an
>>>>>>>
>>>>>>> Any suggestions on how to proceed here?
>>>>>>>
>>>>>>> #regzbot introduced: ad5c6ecef27e
>>>>>>>
>>>>>>> [1] https://www.ti.com/lit/ds/symlink/sn65dsi83.pdf
>>>>>>>  
>>>>>>
>>>>>> The PLL should be locked.
>>>>>>
>>>>>> Also in the datasheet, in 'Table 7-2. Initialization Sequence', the status
>>>>>> is checked at the end of the initialization sequence and the sequence has to
>>>>>> be done again when the status register value is not 0x00.
>>>>>>
>>>>>> Even before monitoring (irq or polling method), you have an issue with your PLL
>>>>>> mentioned with the "sn65dsi83 2-002c: Unexpected link status 0x01" message.
>>>>>>
>>>>>> I don't understand even how your panel can be correctly driven with the bridge
>>>>>> PLL unlock.  
>>>>>
>>>>> We'll try to figure out the reason and see what's the best path forward.
>>>>>
>>>>> Whatever was the reason it was working before, and it should stay
>>>>> working  
>>>>
>>>> I agree with Hervé that using the chip with an unlocked PLL looks dangerous
>>>> and totally out of spec. So I encourage you to investigate what is going on
>>>> in the hardware looking for the root cause, checking whether the PLL is
>>>> really unlocked and how to get it working properly.
>>>>
>>>> The driver should definitely be written focusing on the nominal case and
>>>> handle out-of-spec cases as an exception, not the other way around.
>>>>
>>>> I also agree your hardware should not stop working when upgrading to a new
>>>> kernel, so this investigation would ideally nail down the root cause and
>>>> point to a solution in a very short time.
>>>>
>>>> Hervé has matured quite some experience on SN65DSI84 error management,
>>>> leading to his error recovery patch, and I also have a board with that chip
>>>> on my desk. So we may be helpful in discussion, as well as reviewing and
>>>> testing patches.  
>>>
>>> I'd say we should do it the other way around. If that patch breaks
>>> systems that were working fine so far without a clear reason, we should
>>> revert the offending commit, and *then* work towards a solution to
>>> support error recovery that doesn't break that system.
>>>
>>
>> I have the feeling that the broken system has an issue from the beginning.
>> Why has its PLL been unlocked?
>>
>> I would like to understand what happens but, of course, I don't have the
>> hardware to investigate.
>>
>> Could the issue be in a component before the SN65DSI83 bridge?
>> I mean the component in charge of generating the DSI clock can be a culprit.
> 
> I understand what you're saying, but it's not the right way to think about it.
> 
> Let's change perspective.
> 
> Your work laptop just got a kernel upgrade, and its display doesn't work
> anymore. Would you be happy with the answer "it was broken all along, we
> might be able to help you fix it, maybe not, who knows when we'll have a
> fix"?
> 
> It's frustrating, you might not even be able to debug it in the first
> place, and most importantly, broken or not, it used to work just fine.
> 
> If it used to work for years, how can you possibly argue that it was
> broken all along?

I agree.

We could either revert the error handling, or change it to print a
message (dev_dbg? though that may flood the log if the unlock IRQ is
being raised all the time) instead of doing a reset.
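
To make the second option concrete, here is a rough, untested userspace
sketch of the decision logic only, not a driver patch. It assumes the PLL
unlock flag is bit 0 of the status register, as reported earlier in the
thread. All names in it (classify_irq_stat, struct pll_log_state,
PLL_LOG_INTERVAL_MS, the irq_action values) are made up for illustration,
and the 5 second interval is an arbitrary choice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: PLL unlock is bit 0 of the IRQ status register (0xE5). */
#define IRQ_STAT_CHA_PLL_UNLOCK	(1u << 0)

/* Hypothetical: minimum time between two PLL unlock messages. */
#define PLL_LOG_INTERVAL_MS	5000

enum irq_action {
	IRQ_ACTION_NONE,	/* no error pending */
	IRQ_ACTION_LOG,		/* only PLL unlock set: print, do not reset */
	IRQ_ACTION_QUIET,	/* only PLL unlock set, printed recently */
	IRQ_ACTION_RESET,	/* read failure or any other error bit */
};

/* Hypothetical per-bridge state used to throttle the message. */
struct pll_log_state {
	bool logged_once;
	uint64_t last_log_ms;
};

enum irq_action classify_irq_stat(int read_ret, uint32_t irq_stat,
				  uint64_t now_ms, struct pll_log_state *st)
{
	/* Same condition as the masking diff quoted above. */
	if (read_ret || (irq_stat & ~IRQ_STAT_CHA_PLL_UNLOCK))
		return IRQ_ACTION_RESET;

	if (!(irq_stat & IRQ_STAT_CHA_PLL_UNLOCK))
		return IRQ_ACTION_NONE;

	/* PLL unlock alone: report it, but at most once per interval. */
	if (st->logged_once &&
	    now_ms - st->last_log_ms < PLL_LOG_INTERVAL_MS)
		return IRQ_ACTION_QUIET;

	st->logged_once = true;
	st->last_log_ms = now_ms;
	return IRQ_ACTION_LOG;
}
```

In the driver itself the throttling would more likely come from one of
the existing ratelimited print helpers such as dev_dbg_ratelimited()
rather than hand-rolled state. The sketch only shows that a PLL unlock on
its own would no longer trigger a reset, while read failures and other
error bits still would.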

 Tomi