Message-ID: <24184349.6Emhk5qWAg@workhorse>
Date: Tue, 09 Dec 2025 13:55:19 +0100
From: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
To: Boris Brezillon <boris.brezillon@...labora.com>,
 Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 Karunika Choo <karunika.choo@....com>
Cc: kernel@...labora.com, linux-kernel@...r.kernel.org,
 dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH 1/2] drm/panthor: Add tracepoint for hardware utilisation changes

On Monday, 8 December 2025 18:21:06 Central European Standard Time Karunika Choo wrote:
> On 03/12/2025 13:56, Nicolas Frattaroli wrote:
> > Mali GPUs have three registers that indicate which parts of the hardware
> > are powered and active at any moment. These take the form of bitmaps. In
> > the case of SHADER_PWRACTIVE for example, a high bit indicates that the
> > shader core corresponding to that bit index is active. These bitmaps
> > aren't solely contiguous bits, as it's common to have holes in the
> > sequence of shader core indices, and the actual set of which cores are
> > present is defined by the "shader present" register.
> > 
> > When the GPU finishes a power state transition, it fires a
> > GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> > received, the PWRACTIVE registers will likely contain interesting new
> > information.
> > 
> > This is not to be confused with the PWR_IRQ_POWER_CHANGED_ALL interrupt,
> > which is something related to Mali v14+'s power control logic. The
> > PWRACTIVE registers and corresponding interrupts are already available
> > in v9 and onwards.
> > 
> > Expose this as a tracepoint to userspace. This allows users to debug
> > various scenarios and gather interesting information, such as: knowing
> > how much hardware is lit up at any given time, correlating graphics
> > corruption with a specific active shader core, measuring when hardware
> > is allowed to go to an inactive state again, and so on.
> > 
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
> > ---
> >  drivers/gpu/drm/panthor/panthor_device.c |  1 +
> >  drivers/gpu/drm/panthor/panthor_gpu.c    |  9 ++++++++
> >  drivers/gpu/drm/panthor/panthor_trace.h  | 38 ++++++++++++++++++++++++++++++++
> >  3 files changed, 48 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> > index e133b1e0ad6d..a3cb934104b8 100644
> > --- a/drivers/gpu/drm/panthor/panthor_device.c
> > +++ b/drivers/gpu/drm/panthor/panthor_device.c
> > @@ -548,6 +548,7 @@ int panthor_device_resume(struct device *dev)
> >  			    DRM_PANTHOR_USER_MMIO_OFFSET, 0, 1);
> >  	atomic_set(&ptdev->pm.state, PANTHOR_DEVICE_PM_STATE_ACTIVE);
> >  	mutex_unlock(&ptdev->pm.mmio_lock);
> > +
> >  	return 0;
> >  
> >  err_suspend_devfreq:
> > diff --git a/drivers/gpu/drm/panthor/panthor_gpu.c b/drivers/gpu/drm/panthor/panthor_gpu.c
> > index 9cb5dee93212..8830aa9a5c4b 100644
> > --- a/drivers/gpu/drm/panthor/panthor_gpu.c
> > +++ b/drivers/gpu/drm/panthor/panthor_gpu.c
> > @@ -22,6 +22,9 @@
> >  #include "panthor_hw.h"
> >  #include "panthor_regs.h"
> >  
> > +#define CREATE_TRACE_POINTS
> > +#include "panthor_trace.h"
> > +
> >  /**
> >   * struct panthor_gpu - GPU block management data.
> >   */
> > @@ -46,6 +49,7 @@ struct panthor_gpu {
> >  	(GPU_IRQ_FAULT | \
> >  	 GPU_IRQ_PROTM_FAULT | \
> >  	 GPU_IRQ_RESET_COMPLETED | \
> > +	 GPU_IRQ_POWER_CHANGED_ALL | \
> 
> Also, we've seen customers complain about too many IRQs originating
> from this event, is there any chance we can enable this conditionally
> i.e. only when the trace point is enabled?

Yeah, that's something I've been trying to look into. I'll need to
do some more digging to see if there's a way to run a callback when
an event tracepoint is enabled. That'd be the ideal way to do this,
because then we can just modify the interrupt mask in the callback.
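
As a rough illustration of that idea (not actual panthor code): the
kernel's TRACE_EVENT_FN() macro accepts registration and unregistration
callbacks, which may be one route, though whether it fits this driver is
not established here. The userspace sketch below only models the
bookkeeping. Every name in it (fake_gpu, trace_power_reg,
trace_power_unreg, the IRQ_* bit positions) is hypothetical, and the
callbacks take a device pointer purely for convenience; real callbacks
of that kind take no arguments, so a driver would have to locate its
devices some other way.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions, chosen for illustration only. */
#define IRQ_FAULT             (UINT32_C(1) << 0)
#define IRQ_RESET_COMPLETED   (UINT32_C(1) << 8)
#define IRQ_POWER_CHANGED_ALL (UINT32_C(1) << 10)

/* Interrupts that stay unmasked regardless of tracing. */
#define IRQ_BASE_MASK (IRQ_FAULT | IRQ_RESET_COMPLETED)

/* Hypothetical stand-in for a device and its interrupt mask register. */
struct fake_gpu {
	uint32_t irq_mask;        /* models the interrupt mask register */
	unsigned int trace_users; /* number of active tracepoint users */
};

static void fake_gpu_init(struct fake_gpu *gpu)
{
	gpu->irq_mask = IRQ_BASE_MASK;
	gpu->trace_users = 0;
}

/* Would run when the tracepoint gains a user: unmask on the first one. */
static int trace_power_reg(struct fake_gpu *gpu)
{
	if (gpu->trace_users++ == 0)
		gpu->irq_mask |= IRQ_POWER_CHANGED_ALL;
	return 0;
}

/* Would run when the tracepoint loses a user: mask again on the last one. */
static void trace_power_unreg(struct fake_gpu *gpu)
{
	assert(gpu->trace_users > 0);
	if (--gpu->trace_users == 0)
		gpu->irq_mask &= ~IRQ_POWER_CHANGED_ALL;
}
```

The user count is there so that the interrupt stays unmasked until the
last user goes away. A real driver would also need locking around the
mask update and would have to write the new mask to the hardware.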

For what it's worth, it doesn't fire very often for me: orders of
magnitude less often than the job interrupt, at least. But I assume
this is highly implementation-dependent, e.g. bigger designs with
more complex power setups and more reasons to enable only part of the
hardware will likely see it fire far more often.

Kind regards,
Nicolas Frattaroli

> 
> Kind regards,
> Karunika



