Message-ID: <20251119122443.GA29208@francesco-nb>
Date: Wed, 19 Nov 2025 13:24:57 +0100
From: Francesco Dolcini <francesco@...cini.it>
To: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
Cc: Francesco Dolcini <francesco@...cini.it>,
	Luca Ceresoli <luca.ceresoli@...tlin.com>,
	Herve Codina <herve.codina@...tlin.com>,
	Maxime Ripard <mripard@...nel.org>,
	João Paulo Gonçalves <jpaulo.silvagoncalves@...il.com>,
	Andrzej Hajda <andrzej.hajda@...el.com>,
	Neil Armstrong <neil.armstrong@...aro.org>,
	Robert Foss <rfoss@...nel.org>,
	Laurent Pinchart <Laurent.pinchart@...asonboard.com>,
	Jonas Karlman <jonas@...boo.se>,
	Jernej Skrabec <jernej.skrabec@...il.com>,
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
	Thomas Zimmermann <tzimmermann@...e.de>,
	David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
	João Paulo Gonçalves <joao.goncalves@...adex.com>,
	linux-kernel@...r.kernel.org, regressions@...ts.linux.dev,
	thomas.petazzoni@...tlin.com
Subject: Re: [REGRESSION] TI SN65DSI83 is being reset making display to blink
 On/Off

On Wed, Nov 19, 2025 at 02:09:04PM +0200, Tomi Valkeinen wrote:
> On 19/11/2025 13:12, Francesco Dolcini wrote:
> > On Wed, Nov 19, 2025 at 11:08:23AM +0100, Luca Ceresoli wrote:
> >>
> >> On Wed Nov 19, 2025 at 9:40 AM CET, Maxime Ripard wrote:
> >>> On Wed, Nov 19, 2025 at 08:51:27AM +0100, Herve Codina wrote:
> >>>> On Tue, 18 Nov 2025 17:56:36 +0100
> >>>> Maxime Ripard <mripard@...nel.org> wrote:
> >>>>> On Mon, Nov 17, 2025 at 04:27:28PM +0100, Luca Ceresoli wrote:
> >>>>>> On Thu Nov 13, 2025 at 10:19 AM CET, Francesco Dolcini wrote:
> >>>>>>> On Thu, Nov 13, 2025 at 08:49:50AM +0100, Herve Codina wrote:
> >>>>>>>> On Mon, 10 Nov 2025 16:03:51 -0300
> >>>>>>>> João Paulo Gonçalves <jpaulo.silvagoncalves@...il.com> wrote:
> >>>>>>>>> After commit ad5c6ecef27e ("drm: bridge: ti-sn65dsi83: Add error
> >>>>>>>>> recovery mechanism"), our DSI display stopped working correctly. The
> >>>>>>>>> display internally uses a TI SN65DSI83 to convert DSI-to-LVDS, and with
> >>>>>>>>> the change, it keeps blinking on and off because the bridge is being
> >>>>>>>>> reset by the error recovery mechanism.
> >>>>>>>>>
> >>>>>>>>> Even before the change, it was possible to see the message below from
> >>>>>>>>> the driver indicating that the bridge's internal PLL was not locked
> >>>>>>>>> (register 0xE5, bit 0 in [1]):
> >>>>>>>>>
> >>>>>>>>> [ 11.198616] sn65dsi83 2-002c: Unexpected link status 0x01
> >>>>>>>>>
> >>>>>>>>> However, it was working. After the change, it stopped working. Masking
> >>>>>>>>> the PLL error makes it work again:
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> >>>>>>>>> index 033c44326552..89a0a2ab45b1 100644
> >>>>>>>>> --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> >>>>>>>>> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c
> >>>>>>>>> @@ -429,7 +429,7 @@ static void sn65dsi83_handle_errors(struct sn65dsi83 *ctx)
> >>>>>>>>>          */
> >>>>>>>>>
> >>>>>>>>>         ret = regmap_read(ctx->regmap, REG_IRQ_STAT, &irq_stat);
> >>>>>>>>> -       if (ret || irq_stat) {
> >>>>>>>>> +       if (ret || (irq_stat & ~REG_IRQ_STAT_CHA_PLL_UNLOCK)) {
> >>>>>>>>>                 /*
> >>>>>>>>>                  * IRQ acknowledged is not always possible (the bridge can be in
> >>>>>>>>>                  * a state where it doesn't answer anymore). To prevent an
> >>>>>>>>>
> >>>>>>>>> Any suggestions on how to proceed here?
> >>>>>>>>>
> >>>>>>>>> #regzbot introduced: ad5c6ecef27e
> >>>>>>>>>
> >>>>>>>>> [1] https://www.ti.com/lit/ds/symlink/sn65dsi83.pdf
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> The PLL should be locked.
> >>>>>>>>
> >>>>>>>> Also in the datasheet, in 'Table 7-2. Initialization Sequence', the status
> >>>>>>>> is checked at the end of the initialization sequence and the sequence has to
> >>>>>>>> be done again when the status register value is not 0x00.
> >>>>>>>>
> >>>>>>>> Even before monitoring (irq or polling method), you have an issue with your PLL
> >>>>>>>> mentioned with the "sn65dsi83 2-002c: Unexpected link status 0x01" message.
> >>>>>>>>
> >>>>>>>> I don't even understand how your panel can be correctly driven with the
> >>>>>>>> bridge PLL unlocked.
> >>>>>>>
> >>>>>>> We'll try to figure out the reason and see what's the best path forward.
> >>>>>>>
> >>>>>>> Whatever the reason, it was working before, and it should stay
> >>>>>>> working.
> >>>>>>
> >>>>>> I agree with Hervé that using the chip with an unlocked PLL looks dangerous
> >>>>>> and totally out of spec. So I encourage you to investigate what is going on
> >>>>>> in the hardware looking for the root cause, checking whether the PLL is
> >>>>>> really unlocked and how to get it working properly.
> >>>>>>
> >>>>>> The driver should definitely be written focusing on the nominal case and
> >>>>>> handle out-of-spec cases as an exception, not the other way around.
> >>>>>>
> >>>>>> I also agree your hardware should not stop working when upgrading to a new
> >>>>>> kernel, so this investigation would ideally nail down the root cause and
> >>>>>> point to a solution in a very short time.
> >>>>>>
> >>>>>> Hervé has matured quite some experience on SN65DSI84 error management,
> >>>>>> leading to his error recovery patch, and I also have a board with that chip
> >>>>>> on my desk. So we may be helpful in discussion, as well as reviewing and
> >>>>>> testing patches.
> >>>>>
> >>>>> I'd say we should do it the other way around. If that patch breaks
> >>>>> systems that were working fine so far without a clear reason, we should
> >>>>> revert the offending commit, and *then* work towards a solution to
> >>>>> support error recovery that doesn't break that system.
> >>>>>
> >>>>
> >>>> I have the feeling that the broken system has an issue from the beginning.
> >>>> Why has its PLL been unlocked?
> >>>>
> >>>> I would like to understand what happens but, of course, I don't have the
> >>>> hardware to investigate.
> >>>>
> >>>> Could the issue be in a component before the SN65DSI83 bridge?
> >>>> I mean the component in charge of generating the DSI clock could be the culprit.
> >>>
> >>> I understand what you're saying, but it's not the right way to think about it.
> >>>
> >>> Let's change perspective.
> >>>
> >>> Your work laptop just got a kernel upgrade, and its display doesn't work
> >>> anymore. Would you be happy with the answer "it was broken all along, we
> >>> might be able to help you fix it, maybe not, who knows when we'll have a
> >>> fix"?
> >>>
> >>> It's frustrating, you might not even be able to debug it in the first
> >>> place, and most importantly, broken or not, it used to work just fine.
> >>>
> >>> If it used to work for years, how can you possibly argue that it was
> >>> broken all along?
> >>
> >> Fully understood, and we fully agree on the principle.
> >>
> >> My hope is that João/Francesco can investigate and find the root cause, and
> >> we can find a solution that works for both cases in a short time (say, this
> >> week).
> > 
> > This week it will not happen, unfortunately :-(. I have no time to look
> > into it and João has no longer access to the hardware.
> > 
> >> Without that we'd of course need to revert, but the next minute we'd still
> >> need to find a solution to make error management work in the nominal case,
> >> and I suspect we may end up with an ugly "works-without-pll" quirk and keep
> >> it forever.
> > 
> > So, it seems that the actual DSI clock is the root cause, and from some
> > checks yesterday it's not possible to fix it (limitation on the clock
> > generation). In practice the display is working fine with the PLL not
> > locked (quite a few people are using it without any issue).
> 
> I might be mistaken, but I don't think the PLL will work if unlocked...
> But maybe the case is that it unlocks and locks again right afterwards.
> 
> >> João, Francesco, on what hardware do you observe the problem? Which SoC?
> >> Which encoder, any previous bridges?
> > 
> > Verdin AM62, TI AM62 SOC, arch/arm64/boot/dts/ti/k3-am62-verdin.dtsi
> > 
> > There is a DPI to DSI bridge in the module, tc358778, it has a 25MHz
> > reference clock.
> > 
> > TI AM62 DPI -> Toshiba TC358768 DSI -> TI SN65DSI83 -> Display
> > 
> > From a preliminary investigation this is a HW limitation, we are not
> > able to generate a "good enough" DSI clock, see tc358768_calc_pll() for
> 
> I haven't studied the docs or done any testing, but I would think that
> it doesn't matter for the PLL even if the incoming DSI clock is a bit
> off, as long as it's continuous and stable.
> 
> My first thought was that the DSI is using non-continuous clock, but at
> least the driver has code to drop the MIPI_DSI_CLOCK_NON_CONTINUOUS flag.
> 
> > the actual code implementation of it, I believe that the datasheet is
> > not available without NDA.
> > 
> > Maybe the ugly hack "works-without-pll" is the way to go? It will
> > require a DT change, but this seems doable.
> 
> Revert is easier than adding new hacky DT properties... At least until
> the problem is understood.
> 
> > Please note that this is the outcome of a short investigation done
> > yesterday afternoon, so maybe I am overlooking something, unfortunately
> > I do not have the bandwidth to work on it more this week.
> > 
> >> Which clock rates?
> > 71100000
> 
> It would be a good test to try out with a few different clocks.

50 MHz works, for example.

It seems that the issue occurs when the actual display clock differs
from the DSI clock. This can happen for the reason I explained before:
the DSI clock is computed starting from the 25 MHz reference clock, so
not every requested rate can be generated exactly.
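
To illustrate the kind of mismatch I mean, here is a rough sketch, not the
actual tc358768_calc_pll() code. It assumes a simplified integer PLL model
(rate = refclk * fbd / prd, with no output divider and no VCO limits), and
the divider ranges used in the note below, the 24 bpp / 4 lanes relation,
and all function and struct names are hypothetical and only there for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Result of the search: chosen dividers and the rate they produce. */
struct pll_result {
	uint32_t prd;      /* input divider, 1..max_prd (assumed range) */
	uint32_t fbd;      /* feedback multiplier, 1..max_fbd (assumed range) */
	uint64_t rate_hz;  /* refclk_hz * fbd / prd, truncated */
	uint64_t error_hz; /* |rate_hz - target_hz| */
};

/*
 * Per-lane DSI bit clock needed for a given pixel clock.
 * Assumption: bit clock = pixel clock * bits per pixel / number of lanes.
 */
static uint64_t dsi_bit_clock_hz(uint64_t pixel_clock_hz, uint32_t bpp,
				 uint32_t lanes)
{
	assert(lanes > 0);
	return pixel_clock_hz * bpp / lanes;
}

/*
 * Brute-force search for the integer divider pair whose output rate is
 * closest to target_hz. On a tie, the first pair found is kept.
 */
static struct pll_result closest_pll_rate(uint64_t refclk_hz,
					  uint64_t target_hz,
					  uint32_t max_prd, uint32_t max_fbd)
{
	struct pll_result best = { 0, 0, 0, UINT64_MAX };

	for (uint32_t prd = 1; prd <= max_prd; prd++) {
		for (uint32_t fbd = 1; fbd <= max_fbd; fbd++) {
			uint64_t rate = refclk_hz * fbd / prd;
			uint64_t err = rate > target_hz ? rate - target_hz
							: target_hz - rate;

			if (err < best.error_hz)
				best = (struct pll_result){ prd, fbd, rate, err };
		}
	}

	return best;
}
```

Calling closest_pll_rate(25000000, dsi_bit_clock_hz(50000000, 24, 4), 16, 512)
in this model gives an exact 300 MHz (prd = 1, fbd = 12). The same call for
71100000 asks for 426.6 MHz, and the closest rate the sketch finds is
426.5625 MHz, 37.5 kHz off. The real hardware has more constraints than this
model, so the numbers are only indicative.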

Francesco

