Message-ID: <593e90d9-cf04-45a2-8172-98c441ec79f5@ideasonboard.com>
Date: Wed, 19 Nov 2025 14:09:04 +0200
From: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
To: Francesco Dolcini <francesco@...cini.it>,
 Luca Ceresoli <luca.ceresoli@...tlin.com>,
 Herve Codina <herve.codina@...tlin.com>
Cc: Maxime Ripard <mripard@...nel.org>,
 João Paulo Gonçalves
 <jpaulo.silvagoncalves@...il.com>, Andrzej Hajda <andrzej.hajda@...el.com>,
 Neil Armstrong <neil.armstrong@...aro.org>, Robert Foss <rfoss@...nel.org>,
 Laurent Pinchart <Laurent.pinchart@...asonboard.com>,
 Jonas Karlman <jonas@...boo.se>, Jernej Skrabec <jernej.skrabec@...il.com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Thomas Zimmermann <tzimmermann@...e.de>, David Airlie <airlied@...il.com>,
 Simona Vetter <simona@...ll.ch>,
 João Paulo Gonçalves <joao.goncalves@...adex.com>,
 linux-kernel@...r.kernel.org, regressions@...ts.linux.dev,
 thomas.petazzoni@...tlin.com
Subject: Re: [REGRESSION] TI SN65DSI83 is being reset making display to blink
 On/Off

Hi,

On 19/11/2025 13:12, Francesco Dolcini wrote:
> Hello Luca, Herve
> 
> On Wed, Nov 19, 2025 at 11:08:23AM +0100, Luca Ceresoli wrote:
>>
>> On Wed Nov 19, 2025 at 9:40 AM CET, Maxime Ripard wrote:
>>> On Wed, Nov 19, 2025 at 08:51:27AM +0100, Herve Codina wrote:
>>>> On Tue, 18 Nov 2025 17:56:36 +0100
>>>> Maxime Ripard <mripard@...nel.org> wrote:
>>>>> On Mon, Nov 17, 2025 at 04:27:28PM +0100, Luca Ceresoli wrote:
>>>>>> On Thu Nov 13, 2025 at 10:19 AM CET, Francesco Dolcini wrote:
>>>>>>> On Thu, Nov 13, 2025 at 08:49:50AM +0100, Herve Codina wrote:
>>>>>>>> On Mon, 10 Nov 2025 16:03:51 -0300
>>>>>>>> João Paulo Gonçalves <jpaulo.silvagoncalves@...il.com> wrote:
>>>>>>>>> After commit ad5c6ecef27e ("drm: bridge: ti-sn65dsi83: Add error
>>>>>>>>> recovery mechanism"), our DSI display stopped working correctly. The
>>>>>>>>> display internally uses a TI SN65DSI83 to convert DSI-to-LVDS, and with
>>>>>>>>> the change, it keeps blinking on and off because the bridge is being
>>>>>>>>> reset by the error recovery mechanism.
>>>>>>>>>
>>>>>>>>> Even before the change, it was possible to see the message below from
>>>>>>>>> the driver indicating that the bridge's internal PLL was not locked
>>>>>>>>> (register 0xE5, bit 0 in [1]):
>>>>>>>>>
>>>>>>>>> [ 11.198616] sn65dsi83 2-002c: Unexpected link status 0x01
>>>>>>>>>
>>>>>>>>> However, it was working. After the change, it stopped working. Masking
>>>>>>>>> the PLL error makes it work again:
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
>>>>>>>>> index 033c44326552..89a0a2ab45b1 100644
>>>>>>>>> --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
>>>>>>>>> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
>>>>>>>>> @@ -429,7 +429,7 @@ static void sn65dsi83_handle_errors(struct sn65dsi83 *ctx)
>>>>>>>>>          */
>>>>>>>>>
>>>>>>>>>         ret = regmap_read(ctx->regmap, REG_IRQ_STAT, &irq_stat);
>>>>>>>>> -       if (ret || irq_stat) {
>>>>>>>>> +       if (ret || (irq_stat & ~REG_IRQ_STAT_CHA_PLL_UNLOCK)) {
>>>>>>>>>                 /*
>>>>>>>>>                  * IRQ acknowledged is not always possible (the bridge can be in
>>>>>>>>>                  * a state where it doesn't answer anymore). To prevent an
>>>>>>>>>
>>>>>>>>> Any suggestions on how to proceed here?
>>>>>>>>>
>>>>>>>>> #regzbot introduced: ad5c6ecef27e
>>>>>>>>>
>>>>>>>>> [1] https://www.ti.com/lit/ds/symlink/sn65dsi83.pdf
>>>>>>>>>
>>>>>>>>
>>>>>>>> The PLL should be locked.
>>>>>>>>
>>>>>>>> Also in the datasheet, in 'Table 7-2. Initialization Sequence', the status
>>>>>>>> is checked at the end of the initialization sequence and the sequence has to
>>>>>>>> be done again when the status register value is not 0x00.
>>>>>>>>
>>>>>>>> Even before monitoring (irq or polling method), you had an issue with your PLL,
>>>>>>>> as shown by the "sn65dsi83 2-002c: Unexpected link status 0x01" message.
>>>>>>>>
>>>>>>>> I don't even understand how your panel can be correctly driven with the bridge
>>>>>>>> PLL unlocked.
>>>>>>>
>>>>>>> We'll try to figure out the reason and see what's the best path forward.
>>>>>>>
>>>>>>> Whatever the reason, it was working before, and it should stay
>>>>>>> working.
>>>>>>
>>>>>> I agree with Hervé that using the chip with an unlocked PLL looks dangerous
>>>>>> and totally out of spec. So I encourage you to investigate what is going on
>>>>>> in the hardware looking for the root cause, checking whether the PLL is
>>>>>> really unlocked and how to get it working properly.
>>>>>>
>>>>>> The driver should definitely be written focusing on the nominal case and
>>>>>> handle out-of-spec cases as an exception, not the other way around.
>>>>>>
>>>>>> I also agree your hardware should not stop working when upgrading to a new
>>>>>> kernel, so this investigation would ideally nail down the root cause and
>>>>>> point to a solution in a very short time.
>>>>>>
>>>>>> Hervé has gained considerable experience on SN65DSI84 error management,
>>>>>> leading to his error recovery patch, and I also have a board with that chip
>>>>>> on my desk. So we may be helpful in discussion, as well as reviewing and
>>>>>> testing patches.
>>>>>
>>>>> I'd say we should do it the other way around. If that patch breaks
>>>>> systems that were working fine so far without a clear reason, we should
>>>>> revert the offending commit, and *then* work towards a solution to
>>>>> support error recovery that doesn't break that system.
>>>>>
>>>>
>>>> I have the feeling that the broken system has an issue from the beginning.
>>>> Why is its PLL unlocked?
>>>>
>>>> I would like to understand what happens but, of course, I don't have the
>>>> hardware to investigate.
>>>>
>>>> Could the issue be in a component before the SN65DSI83 bridge?
>>>> I mean the component in charge of generating the DSI clock could be the culprit.
>>>
>>> I understand what you're saying, but it's not the right way to think about it.
>>>
>>> Let's change perspective.
>>>
>>> Your work laptop just got a kernel upgrade, and its display doesn't work
>>> anymore. Would you be happy with the answer "it was broken all along, we
>>> might be able to help you fix it, maybe not, who knows when we'll have a
>>> fix"?
>>>
>>> It's frustrating, you might not even be able to debug it in the first
>>> place, and most importantly, broken or not, it used to work just fine.
>>>
>>> If it used to work for years, how can you possibly argue that it was
>>> broken all along?
>>
>> Fully understood, and we fully agree on the principle.
>>
>> My hope is that João/Francesco can investigate and find the root cause, and
>> we can find a solution that works for both cases in a short time (say, this
>> week).
> 
> This week it will not happen, unfortunately :-(. I have no time to look
> into it and João has no longer access to the hardware.
> 
>> Without that we'd of course need to revert, but the next minute we'd still
>> need to find a solution to make error management work in the nominal case,
>> and I suspect we may end up with an ugly "works-without-pll" quirk and keep
>> it forever.
> 
> So, it seems that the actual DSI clock is the root cause, and from some
> check yesterday it's not possible to fix it (limitation on the clock
> generation). In practice the display is working fine with the PLL not
> locked (quite a few people are using it without any issue).

I might be mistaken, but I don't think the PLL will work if unlocked...
But maybe the case is that it unlocks and locks again right afterwards.

>> João, Francesco, on what hardware do you observe the problem? Which SoC?
>> Which encoder, any previous bridges?
> 
> Verdin AM62, TI AM62 SOC, arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi
> 
> There is a DPI to DSI bridge in the module, tc358778, it has a 25MHz
> reference clock.
> 
> TI AM62 DPI -> Toshiba TC358768 DSI -> TI SN65DSI83 -> Display
> 
> From a preliminary investigation this is a HW limitation, we are not
> able to generate a "good enough" DSI clock, see tc358768_calc_pll() for

I haven't studied the docs or done any testing, but I would think that
it doesn't matter for the PLL even if the incoming DSI clock is a bit
off, as long as it's continuous and stable.

My first thought was that the DSI is using non-continuous clock, but at
least the driver has code to drop the MIPI_DSI_CLOCK_NON_CONTINUOUS flag.

> the actual code implementation of it, I believe that the datasheet is
> not available without NDA.
> 
> Maybe the ugly hack "works-without-pll" is the way to go? It will
> require a DT change, but this seems doable.

Revert is easier than adding new hacky DT properties... At least until
the problem is understood.

> Please note that this is the outcome of a short investigation done
> yesterday afternoon, so maybe I am overlooking something, unfortunately
> I do not have the bandwidth to work on it more this week.
> 
>> Which clock rates?
> 71100000

It would be a good test to try it out with a few different clocks.

 Tomi