[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251218145621.50d371cc@fedora>
Date: Thu, 18 Dec 2025 14:56:21 +0100
From: Boris Brezillon <boris.brezillon@...labora.com>
To: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
Cc: Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, Maxime Ripard
<mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, David Airlie
<airlied@...il.com>, Simona Vetter <simona@...ll.ch>, Chia-I Wu
<olvaffe@...il.com>, Karunika Choo <karunika.choo@....com>,
kernel@...labora.com, linux-kernel@...r.kernel.org,
dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH v4 1/3] drm/panthor: Add panthor_*_irq_mask_set helper
On Thu, 18 Dec 2025 14:42:42 +0100
Nicolas Frattaroli <nicolas.frattaroli@...labora.com> wrote:
> On Thursday, 18 December 2025 13:38:25 Central European Standard Time Boris Brezillon wrote:
> > On Wed, 17 Dec 2025 15:29:38 +0100
> > Nicolas Frattaroli <nicolas.frattaroli@...labora.com> wrote:
> >
> > > Add a function to modify an IRQ's mask. If the IRQ is currently active,
> > > it will write to the register, otherwise it will only set the struct
> > > member.
> > >
> > > There's no locking done to guarantee exclusion with the other two
> > > functions that touch the IRQ mask, and it should only be called from a
> > > context where the circumstances guarantee no concurrent access is
> > > performed.
> > >
> > > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
> > > ---
> > > drivers/gpu/drm/panthor/panthor_device.h | 7 +++++++
> > > 1 file changed, 7 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> > > index f35e52b9546a..894d28b3eb02 100644
> > > --- a/drivers/gpu/drm/panthor/panthor_device.h
> > > +++ b/drivers/gpu/drm/panthor/panthor_device.h
> > > @@ -470,6 +470,13 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev, \
> > > panthor_ ## __name ## _irq_threaded_handler, \
> > > IRQF_SHARED, KBUILD_MODNAME "-" # __name, \
> > > pirq); \
> > > +} \
> > > + \
> > > +static inline void panthor_ ## __name ## _irq_mask_set(struct panthor_irq *pirq, u32 mask) \
> > > +{ \
> > > + pirq->mask = mask; \
> > > + if (!atomic_read(&pirq->suspended)) \
> > > + gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, mask); \
> >
> > This is racy if called outside the (threaded) IRQ handler, which I
> > believe is the case since its called when a trace point is enabled:
> >
> > 1. _irq_raw_handler() sets INT_MASK to zero to avoid receiving
> > interrupts from this IRQ line until the threaded handler has processed
> > events
> >
> > 2. _irq_mask_set() sets INT_MASK to something non-zero, meaning the
> > interrupt will be re-enabled before events have been processed
>
> Good catch, I only considered the case where the irq mask is modified
> by a PM runtime suspend/resume, not by the actual handler functions
> itself.
>
> > this leads to at least one spurious interrupt being received before we
> > set INT_MASK to zero again. Probably not the end of the world, but if
> > we can avoid it, that'd be better.
> >
> > Also, I'd like to see if we could re-purpose panthor_irq::mask to be
> > the mask of events the user wants to monitor instead of a pure proxy of
> > INT_MASK. If we do that, and we make panthor_irq::mask an atomic_t,
>
> Yeah, I was wondering why panthor_irq::mask changed on IRQ suspend
> and resume, since that information is already stored in the
> "suspended" atomic.
That's actually done on purpose, to avoid the threaded handler from
re-enabling the interrupts if it's still running when irq_suspend() is
called, since suspended it set to true only after the synchronize_irq()
(there a reason for that, but I don't remember it :D).
Now, if we re-purpose the suspended into a multi-state value, we can go:
SUSPENDED => IRQ is suspended/disabled
IDLE => active state, but no IRQ being processed
PROCESSING => active state, the threaded handler is on its way
SUSPENDING => about to be suspended (should be set to that at
the beginning of irq_suspend()
>
> > we can add panthor_xxx_irq_{enable,disable}_event() helpers that would
> > do the atomic_{or,and} on panthor_irq::mask, and write the new value
> > to _INT_MASK if:
> > - we're processing events in the threaded handler (we would need
> > another field, or we'd need to turn suspended into a state that can
> > encode more than just "suspended or not")
>
> Right, I assume synchronize_irq will not really fix the race,
> since a single synchronization point does not a mutually exclusive
> section make.
Correct.
>
> I thought about suspending IRQs, synchronising, then changing the
> mask, then re-enabling them, but that feels potentially disruptive.
Yeah, that's a bit heavy, especially since most of the time we're going
to hit the case where the update can go in directly without impacting
the rest of the driver.
> Though I may be misunderstanding the hardware here; if a bit is
> disabled in the INT_MASK and the hardware would want to fire such
> an interrupt, does it discard it? Because in that case, any solution
> where we suspend IRQs -> wait for threaded handler to finish ->
> modify mask -> resume IRQs could lead to us missing out on critical
> knowledge, like that a GPU fault occurred.
That too.
>
> In a perfect world, this hardware would use a register access
> pattern that allows for race-free updates of non-overlapping bits,
> like FIELD_PREP_WM16 implements. ;) But we do not live in such a
> world.
Actually, it does support that for STATUS (there's a separate CLEAR
register), but the MASK is not something you're supposed to modify
concurrently at a high rate, so it's not surprising we don't have the
same for INT_MASK. Also, atomic updates is something we can ensure at
a higher level, and the write to INT_MASK itself is atomic.
Powered by blists - more mailing lists