Message-ID: <5534715.31r3eYUQgx@workhorse>
Date: Mon, 15 Dec 2025 20:13:39 +0100
From: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
To: Boris Brezillon <boris.brezillon@...labora.com>,
 Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
 Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
 Chia-I Wu <olvaffe@...il.com>, Karunika Choo <karunika.choo@....com>,
 Lukas Zapolskas <lukas.zapolskas@....com>
Cc: kernel@...labora.com, linux-kernel@...r.kernel.org,
 dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH v3 2/3] drm/panthor: Add tracepoint for hardware utilisation changes

On Monday, 15 December 2025 18:21:52 Central European Standard Time Lukas Zapolskas wrote:
> Hello Nicolas,
> 
> 
> On 11/12/2025 16:15, Nicolas Frattaroli wrote:
> > Mali GPUs have three registers that indicate which parts of the hardware
> > are powered at any moment. These take the form of bitmaps. In the case
> > of SHADER_READY for example, a high bit indicates that the shader core
> > corresponding to that bit index is powered on. These bitmaps aren't
> > solely contiguous bits, as it's common to have holes in the sequence of
> > shader core indices, and the actual set of which cores are present is
> > defined by the "shader present" register.
> > 
> > When the GPU finishes a power state transition, it fires a
> > GPU_IRQ_POWER_CHANGED_ALL interrupt. After such an interrupt is
> > received, the _READY registers will contain new interesting data. During
> > power transitions, the GPU_IRQ_POWER_CHANGED interrupt will fire, and
> > the registers will likewise contain potentially changed data.
> > 
> > This is not to be confused with the PWR_IRQ_POWER_CHANGED_ALL interrupt,
> > which is something related to Mali v14+'s power control logic. The
> > _READY registers and corresponding interrupts are already available in
> > v9 and onwards.
> > 
> > Expose the data as a tracepoint to userspace. This allows users to debug
> > various scenarios and gather interesting information, such as: knowing
> > how much hardware is lit up at any given time, correlating graphics
> > corruption with a specific powered shader core, measuring when hardware
> > is allowed to go to a powered off state again, and so on.
> > 
> > The registration/unregistration functions for the tracepoint go through
> > a wrapper in panthor_hw.c, so that v14+ can implement the same
> > tracepoint by adding its hardware specific IRQ on/off callbacks to the
> > panthor_hw.ops member.
> > 
> > Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>
> > ---
> >  drivers/gpu/drm/panthor/panthor_gpu.c   | 38 ++++++++++++++++++--
> >  drivers/gpu/drm/panthor/panthor_gpu.h   |  2 ++
> >  drivers/gpu/drm/panthor/panthor_hw.c    | 62 +++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/panthor/panthor_hw.h    |  8 +++++
> >  drivers/gpu/drm/panthor/panthor_trace.h | 59 +++++++++++++++++++++++++++++++
> >  5 files changed, 167 insertions(+), 2 deletions(-)
> > 
> > [... snip ...]
> > +/**
> > + * gpu_power_status - called whenever parts of GPU hardware are turned on or off
> > + * @dev: pointer to the &struct device, for printing the device name
> > + * @shader_bitmap: bitmap where a high bit indicates the shader core at a given
> > + *                 bit index is on, and a low bit indicates a shader core is
> > + *                 either powered off or absent
> > + * @tiler_bitmap: bitmap where a high bit indicates the tiler unit at a given
> > + *                bit index is on, and a low bit indicates a tiler unit is
> > + *                either powered off or absent
> > + * @l2_bitmap: bitmap where a high bit indicates the L2 cache at a given bit
> > + *             index is on, and a low bit indicates the L2 cache is either
> > + *             powered off or absent
> > + */
> > +TRACE_EVENT_FN(gpu_power_status,
> > +	TP_PROTO(const struct device *dev, u64 shader_bitmap, u64 tiler_bitmap,
> > +		 u64 l2_bitmap),
> > +	TP_ARGS(dev, shader_bitmap, tiler_bitmap, l2_bitmap),
> > +	TP_STRUCT__entry(
> > +		__string(dev_name, dev_name(dev))
> > +		__field(u64, shader_bitmap)
> > +		__field(u64, tiler_bitmap)
> > +		__field(u64, l2_bitmap)
> > +	),
> > +	TP_fast_assign(
> > +		__assign_str(dev_name);
> > +		__entry->shader_bitmap	= shader_bitmap;
> > +		__entry->tiler_bitmap	= tiler_bitmap;
> > +		__entry->l2_bitmap	= l2_bitmap;
> > +	),
> > +	TP_printk("%s: shader_bitmap=0x%llx tiler_bitmap=0x%llx l2_bitmap=0x%llx",
> > +		  __get_str(dev_name), __entry->shader_bitmap, __entry->tiler_bitmap,
> > +		  __entry->l2_bitmap
> > +	),
> > +	panthor_hw_power_status_register, panthor_hw_power_status_unregister
> > +);
> 
> What is the expectation of stability for this tracepoint? Because I worry about future architectures 
> that add different hardware blocks: we would have to either extend this tracepoint, or deprecate it
> and make another one that is very similar.

There is no problem with extending this tracepoint in the future.
Linux tracepoints automatically have a machine-readable description
of their data layout in tracefs, in the "format" file. Adding new
fields to the tracepoint will not interfere with any tooling that
uses the format description to parse the data. The tracepoint's ABI
is self-describing in that sense.
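
As a rough illustration of why added fields are harmless (this is a
sketch, not part of the patch and not how any particular tool is
implemented): a consumer can look fields up by name in the text of the
"format" file and take offset and size from there, instead of
hard-coding a struct layout. The helper names find_field() and
read_u64() below are made up for this example, and the parsing assumes
the usual "field:<decl>; offset:<n>; size:<n>; signed:<n>;" line shape.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Location of one field inside a raw trace record (hypothetical helper type). */
struct field_info {
	unsigned int offset;
	unsigned int size;
};

/*
 * Look up a field by name in the text of a tracefs "format" file.
 * Returns 0 and fills *out on success, -1 if the field is not described.
 */
int find_field(const char *fmt, const char *name, struct field_info *out)
{
	size_t name_len = strlen(name);
	const char *p = fmt;

	while ((p = strstr(p, "field:")) != NULL) {
		const char *decl = p + strlen("field:");
		const char *semi = strchr(decl, ';');
		const char *start, *end, *off, *size;

		if (!semi)
			break;
		p = semi;

		/* The field name is the last identifier of the declaration. */
		end = semi;
		if (end > decl && end[-1] == ']') {
			/* Skip an array suffix such as "[16]". */
			while (end > decl && end[-1] != '[')
				end--;
			if (end > decl)
				end--;
		}
		start = end;
		while (start > decl && start[-1] != ' ' &&
		       start[-1] != '\t' && start[-1] != '*')
			start--;

		if ((size_t)(end - start) != name_len ||
		    strncmp(start, name, name_len) != 0)
			continue;

		/* offset: and size: follow the declaration on the same line. */
		off = strstr(semi, "offset:");
		size = strstr(semi, "size:");
		if (!off || !size)
			return -1;
		out->offset = (unsigned int)strtoul(off + strlen("offset:"), NULL, 10);
		out->size = (unsigned int)strtoul(size + strlen("size:"), NULL, 10);
		return 0;
	}
	return -1;
}

/* Read a host-endian u64 field from a raw record; returns 0 on a size mismatch. */
uint64_t read_u64(const unsigned char *record, const struct field_info *f)
{
	uint64_t v = 0;

	if (f->size == sizeof(v))
		memcpy(&v, record + f->offset, sizeof(v));
	return v;
}
```

A tool written this way keeps finding shader_bitmap, tiler_bitmap and
l2_bitmap by name even if a later kernel adds another field and the
offsets shift; the real offsets always come from the format file on
the running system.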

> Do you have any sort of userspace tooling that is consuming
> this or is this more for local debugging? 
 
The specific use-case is Perfetto. This tracepoint can be consumed
by Perfetto with no special parsing added to Perfetto itself, as
it can consume event tracepoints like this based on their "format"
description.

Perfetto can do manual parsing for some tracepoints to integrate
them into the timeline differently, but for instantaneous events
like this, that is not needed.
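
For anyone post-processing the events outside of Perfetto, decoding
the bitmaps is simple bit arithmetic. The sketch below is illustrative
only: the function names are invented for this example, and the
"present" mask is assumed to be obtained separately, since the
tracepoint itself only carries the three powered-on bitmaps.

```c
#include <stdint.h>

/* Number of set bits, i.e. how many units a powered-on bitmap reports. */
unsigned int powered_count(uint64_t bitmap)
{
	unsigned int n = 0;

	while (bitmap) {
		bitmap &= bitmap - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Store the bit indices of powered units into idx[], at most max of them.
 * Returns how many indices were written. Indices need not be contiguous,
 * as core numbering can have holes.
 */
unsigned int powered_indices(uint64_t bitmap, unsigned int *idx,
			     unsigned int max)
{
	unsigned int n = 0;

	for (unsigned int bit = 0; bit < 64 && n < max; bit++) {
		if (bitmap & (UINT64_C(1) << bit))
			idx[n++] = bit;
	}
	return n;
}

/*
 * Units that exist but are currently off. A low bit in the traced bitmap
 * alone cannot tell "off" from "absent", so a present mask is needed.
 */
uint64_t present_but_off(uint64_t present, uint64_t powered)
{
	return present & ~powered;
}
```

With a made-up present mask of 0x50005 and a traced shader_bitmap of
0x10001, this reports two powered cores (indices 0 and 16) and 0x40004
as present but powered off.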

Kind regards,
Nicolas Frattaroli

> > +
> > +#endif /* __PANTHOR_TRACE_H__ */
> > +
> > +#undef TRACE_INCLUDE_PATH
> > +#define TRACE_INCLUDE_PATH .
> > +#undef TRACE_INCLUDE_FILE
> > +#define TRACE_INCLUDE_FILE panthor_trace
> > +
> > +#include <trace/define_trace.h>
> > 
> 
> Kind regards,
> Lukas
> 
> 




