lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADL8D3ZiGA+XnyvyFCQbcK_SCffrfbhMXFpWzxWVjhuOFeu-Yg@mail.gmail.com>
Date: Mon, 21 Oct 2024 10:46:12 -0400
From: Jon Cormier <jcormier@...ticallink.com>
To: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
Cc: Devarsh Thakkar <devarsht@...com>, jyri.sarha@....fi, airlied@...il.com, 
	maarten.lankhorst@...ux.intel.com, mripard@...nel.org, tzimmermann@...e.de, 
	dri-devel@...ts.freedesktop.org, simona@...ll.ch, 
	linux-kernel@...r.kernel.org, praneeth@...com, vigneshr@...com, 
	aradhya.bhatia@...ux.dev, s-jain1@...com, r-donadkar@...com, sam@...nborg.org, 
	bparrot@...com
Subject: Re: [PATCH 1/2] drm/tidss: Clear the interrupt status for interrupts
 being disabled

Adding the e2e thread that has instigated this change.

https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1394222/am625-issue-about-tidss-rcu_preempt-self-detected-stall-on-cpu?pifragment-323307=1#pifragment-323307=2

Summary of original problem: An AM62x device using the TIDSS driver,
can lock up after hours of running.  The lock ups are often detected
by the rcu_preempt system.  The lock ups turned out to be caused by an
infinite interrupt loop (irq storm?) in the TIDSS_DISPC driver.

The k3_clear_irqstatus function which is responsible for clearing the
interrupt bits, only clear the the level 1 interrupts if the level 2
ones are set.  This leaves a small window where if for whatever reason
the level 2 interrupts aren't set but the level 1's are, then we will
never clear the level 1 interrupt.

The change as submitted is not sufficient to prevent the irq storm.
I've tested these two patches for several weeks now and they reduce
the frequency of the irq storm from once a day to once every few days,
but don't prevent it.

I suggest that the k3_clear_irqstatus function needs to be updated
such that it's not possible for the level 1 DISPC_IRQSTATUS bit to
remain uncleared.

The following hack proposed by Bin and team removes the possibility of
the irq storm happening, while introducing a small chance of clearing
interrupts that weren't intended.  Though I would assume that if the
level 2 interrupts aren't cleared, they would reassert the level 1
DISPC_IRQSTATUS so maybe that's not much of a risk.  Most other
drivers when clearing interrupts do a read and then write to clear
interrupts so there is precedence.

diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c
b/drivers/gpu/drm/tidss/tidss_dispc.c
index 60f69be36692..0b8a3d999c54 100644
--- a/drivers/gpu/drm/tidss/tidss_dispc.c
+++ b/drivers/gpu/drm/tidss/tidss_dispc.c
@@ -900,27 +900,27 @@ static
 void dispc_k3_clear_irqstatus(struct dispc_device *dispc, dispc_irq_t
clearmask)
 {
     unsigned int i;
-    u32 top_clear = 0;

     for (i = 0; i < dispc->feat->num_vps; ++i) {
         if (clearmask & DSS_IRQ_VP_MASK(i)) {
             dispc_k3_vp_write_irqstatus(dispc, i, clearmask);
-            top_clear |= BIT(i);
         }
     }
+
     for (i = 0; i < dispc->feat->num_planes; ++i) {
         if (clearmask & DSS_IRQ_PLANE_MASK(i)) {
             dispc_k3_vid_write_irqstatus(dispc, i, clearmask);
-            top_clear |= BIT(4 + i);
         }
     }
+
     if (dispc->feat->subrev == DISPC_K2G)
         return;

-    dispc_write(dispc, DISPC_IRQSTATUS, top_clear);
-
-    /* Flush posted writes */
-    dispc_read(dispc, DISPC_IRQSTATUS);
+    /* Always clear the level 1 irqstatus (DISPC_IRQSTATUS) unconditionally
Note I'm not sure we are confident in the reasoning outlined in this comment
+     * due to an IP bug where level 1 irq status (DISPC_IRQSTATUS)
would get set delayed even
+     * after level 2 interrupt (DISPC_VID_IRQSTATUS,
DISPC_VP_IRQSTATUS) is cleared.
+     */
+    dispc_write(dispc, DISPC_IRQSTATUS, dispc_read(dispc, DISPC_IRQSTATUS));

I had proposed a more complete version of this patch but there hasn't
been much discussion about it and I've mostly tested Bins version.

 }


On Mon, Oct 21, 2024 at 7:15 AM Tomi Valkeinen
<tomi.valkeinen@...asonboard.com> wrote:
>
> Hi,
>
> On 12/10/2024 18:07, Devarsh Thakkar wrote:
> > It is possible that dispc_{k2g/k3}_set_irqenable can be called for
> > disabling some interrupt events which were previously enabled. However
> > instead of clearing any pending events for the interrupt events that are
> > required to be disabled, it was instead clearing the new interrupt events
> > which were not even enabled.
>
> That's on purpose. When we enable a new interrupt, we want to first
> clear the irqstatus for that interrupt to make sure there's no old
> status left lying around. If I'm not mistaken, enabling an interrupt
> with an irqstatus bit set will immediately trigger the interrupt.
>
> > For e.g. While disabling the vsync events, dispc_k3_set_irqenable tries to
> > clear DSS_IRQ_DEVICE_OCP_ERR which was not enabled per the old_mask at all
> > as shown below :
> >
> > "dispc_k3_set_irqenable : irqenabled - mask = 91, old = f0, clr = 1" where
> > clr = (mask ^ old_mask) & old_mask
>
> That's a bit odd... Why was the DSS_IRQ_DEVICE_OCP_ERR not already
> enabled? It is enabled in the tidss_irq_install().
>
> Or maybe it had been enabled by the driver, but as the HW doesn't
> support that bit, it reads always as 0. I have an unsent patch to drop
> DSS_IRQ_DEVICE_OCP_ERR.
>
> > This corrects the bit mask to make sure that it always clears any pending
> > interrupt events that are requested to be disabled before disabling them
> > actually.
>
> I think the point here makes sense: if we disable interrupts with
> dispc_set_irqenable(), we don't want to see interrupt handling for the
> disabled interrupts after the call.
>
> However, if you clear the irqstatus for an interrupt that will be
> disabled, but clear it _before_ disabling the interrupt, the interrupt
> might trigger right after clearing the irqstatus but before disabling it.
>
>   Tomi
>
> > Fixes: 32a1795f57ee ("drm/tidss: New driver for TI Keystone platform Display SubSystem")
> > Reported-by: Jonathan Cormier <jcormier@...ticallink.com>
> > Signed-off-by: Devarsh Thakkar <devarsht@...com>
> > ---
> >   drivers/gpu/drm/tidss/tidss_dispc.c | 8 ++++----
> >   1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/tidss/tidss_dispc.c b/drivers/gpu/drm/tidss/tidss_dispc.c
> > index 1ad711f8d2a8..b04419b24863 100644
> > --- a/drivers/gpu/drm/tidss/tidss_dispc.c
> > +++ b/drivers/gpu/drm/tidss/tidss_dispc.c
> > @@ -700,8 +700,8 @@ void dispc_k2g_set_irqenable(struct dispc_device *dispc, dispc_irq_t mask)
> >   {
> >       dispc_irq_t old_mask = dispc_k2g_read_irqenable(dispc);
> >
> > -     /* clear the irqstatus for newly enabled irqs */
> > -     dispc_k2g_clear_irqstatus(dispc, (mask ^ old_mask) & mask);
> > +     /* clear the irqstatus for irqs that are being disabled now */
> > +     dispc_k2g_clear_irqstatus(dispc, (mask ^ old_mask) & old_mask);
> >
> >       dispc_k2g_vp_set_irqenable(dispc, 0, mask);
> >       dispc_k2g_vid_set_irqenable(dispc, 0, mask);
> > @@ -843,8 +843,8 @@ static void dispc_k3_set_irqenable(struct dispc_device *dispc,
> >
> >       old_mask = dispc_k3_read_irqenable(dispc);
> >
> > -     /* clear the irqstatus for newly enabled irqs */
> > -     dispc_k3_clear_irqstatus(dispc, (old_mask ^ mask) & mask);
> > +     /* clear the irqstatus for irqs that are being disabled now */
> > +     dispc_k3_clear_irqstatus(dispc, (old_mask ^ mask) & old_mask);
> >
> >       for (i = 0; i < dispc->feat->num_vps; ++i) {
> >               dispc_k3_vp_set_irqenable(dispc, i, mask);
>


--
Jonathan Cormier
Software Engineer

Voice:  315.425.4045 x222



http://www.CriticalLink.com
6712 Brooklawn Parkway, Syracuse, NY 13211

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ