lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3267470.irdbgypaU6@workhorse>
Date: Thu, 18 Dec 2025 14:42:42 +0100
From: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
To: Boris Brezillon <boris.brezillon@...labora.com>
Cc: Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 Chia-I Wu <olvaffe@...il.com>, Karunika Choo <karunika.choo@....com>,
 kernel@...labora.com, linux-kernel@...r.kernel.org,
 dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH v4 1/3] drm/panthor: Add panthor_*_irq_mask_set helper

On Thursday, 18 December 2025 13:38:25 Central European Standard Time Boris Brezillon wrote:
> On Wed, 17 Dec 2025 15:29:38 +0100
> Nicolas Frattaroli <nicolas.frattaroli@...labora.com> wrote:
> 
> > Add a function to modify an IRQ's mask. If the IRQ is currently active,
> > it will write to the register, otherwise it will only set the struct
> > member.
> > 
> > There's no locking done to guarantee exclusion with the other two
> > functions that touch the IRQ mask, and it should only be called from a
> > context where the circumstances guarantee no concurrent access is
> > performed.
> > 
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
> > ---
> >  drivers/gpu/drm/panthor/panthor_device.h | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> > index f35e52b9546a..894d28b3eb02 100644
> > --- a/drivers/gpu/drm/panthor/panthor_device.h
> > +++ b/drivers/gpu/drm/panthor/panthor_device.h
> > @@ -470,6 +470,13 @@ static int panthor_request_ ## __name ## _irq(struct panthor_device *ptdev,			\
> >  					 panthor_ ## __name ## _irq_threaded_handler,		\
> >  					 IRQF_SHARED, KBUILD_MODNAME "-" # __name,		\
> >  					 pirq);							\
> > +}												\
> > +												\
> > +static inline void panthor_ ## __name ## _irq_mask_set(struct panthor_irq *pirq, u32 mask)	\
> > +{												\
> > +	pirq->mask = mask;									\
> > +	if (!atomic_read(&pirq->suspended))							\
> > +		gpu_write(pirq->ptdev, __reg_prefix ## _INT_MASK, mask);			\
> 
> This is racy if called outside the (threaded) IRQ handler, which I
> believe is the case since its called when a trace point is enabled:
> 
> 1. _irq_raw_handler() sets INT_MASK to zero to avoid receiving
> interrupts from this IRQ line until the threaded handler has processed
> events
> 
> 2. _irq_mask_set() sets INT_MASK to something non-zero, meaning the
> interrupt will be re-enabled before events have been processed

Good catch, I only considered the case where the irq mask is modified
by a PM runtime suspend/resume, not by the actual handler functions
itself.

> this leads to at least one spurious interrupt being received before we
> set INT_MASK to zero again. Probably not the end of the world, but if
> we can avoid it, that'd be better.
> 
> Also, I'd like to see if we could re-purpose panthor_irq::mask to be
> the mask of events the user wants to monitor instead of a pure proxy of
> INT_MASK. If we do that, and we make panthor_irq::mask an atomic_t,

Yeah, I was wondering why panthor_irq::mask changed on IRQ suspend
and resume, since that information is already stored in the
"suspended" atomic.

> we can add panthor_xxx_irq_{enable,disable}_event() helpers that would
> do the atomic_{or,and} on panthor_irq::mask, and write the new value
> to _INT_MASK if:
> - we're processing events in the threaded handler (we would need
>   another field, or we'd need to turn suspended into a state that can
>   encode more than just "suspended or not")

Right, I assume synchronize_irq will not really fix the race,
since a single synchronization point does not a mutually exclusive
section make.

I thought about suspending IRQs, synchronising, then changing the
mask, then re-enabling them, but that feels potentially disruptive.
Though I may be misunderstanding the hardware here; if a bit is
disabled in the INT_MASK and the hardware would want to fire such
an interrupt, does it discard it? Because in that case, any solution
where we suspend IRQs -> wait for threaded handler to finish ->
modify mask -> resume IRQs could lead to us missing out on critical
knowledge, like that a GPU fault occurred.

In a perfect world, this hardware would use a register access
pattern that allows for race-free updates of non-overlapping bits,
like FIELD_PREP_WM16 implements. ;) But we do not live in such a
world.

> - the device is not suspended (that test you already have)
> 
> >  }
> >  
> >  extern struct workqueue_struct *panthor_cleanup_wq;
> > 
> 
> 

Thanks for the review, I'll try to come up with a solution that
won't potentially blow our legs off.

Kind regards,
Nicolas Frattaroli



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ