[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aNUF6IDneKxjTP5t@e129842.arm.com>
Date: Thu, 25 Sep 2025 11:05:44 +0200
From: Marcin Ĺšlusarz <marcin.slusarz@....com>
To: Chia-I Wu <olvaffe@...il.com>
Cc: Boris Brezillon <boris.brezillon@...labora.com>,
Steven Price <steven.price@....com>,
Liviu Dudau <liviu.dudau@....com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Grant Likely <grant.likely@...aro.org>,
Heiko Stuebner <heiko@...ech.de>, dri-devel@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, nd@....com,
Lukas Zapolskas <lukas.zapolskas@....com>
Subject: Re: [PATCH] drm/panthor: add query for calibrated timstamp info
Hi Chia-I,
On Tue, Sep 16, 2025 at 01:07:51PM -0700, Chia-I Wu wrote:
> DRM_PANTHOR_DEV_QUERY_CALIBRATED_TIMESTAMP_INFO provides a way to query
> and calibrate CPU and GPU timestamps.
I worked on a similar patch for Panthor, with a plan of submitting
it upstream soon, but with slightly different requirements, so maybe
we could merge both efforts in a single patch?
The first requirement was that it should be possible to get both CPU
and GPU timestamps, with the expectation that they should be taken as
close as possible (within 50us).
The second requirement was that it should be possible to also get
the value of GPU_CYCLE_COUNT register.
What I did is extend the existing DRM_PANTHOR_DEV_QUERY_TIMESTAMP_INFO
query in backward compatible manner with those new fields and obtaining
gpu and cpu timestamps with preemption and local irqs disabled (more on
that later).
Backward compatibility was achieved by adding new fields at the end of
struct drm_panthor_timestamp_info, and relying on the fact that if user
space passes smaller object it will be silently truncated.
Obtaining all kind of timing information with a single syscall might
be a bit too much, when user space might be interested only in some
data and not the complete view, so I'd propose this as a solution:
1) Extend existing query in backward compatible manner, by adding new
fields at the end.
2) Add flags, cpu timestamp, cycle count, and duration.
3) Flags would be:
DRM_PANTHOR_TIMESTAMP_GPU (1<<0)
DRM_PANTHOR_TIMESTAMP_CPU (1<<1)
DRM_PANTHOR_TIMESTAMP_OFFSET (1<<2)
DRM_PANTHOR_TIMESTAMP_FREQ (1<<3)
DRM_PANTHOR_TIMESTAMP_DURATION (1<<4)
DRM_PANTHOR_TIMESTAMP_SAME_TIME (1<<5)
DRM_PANTHOR_TIMESTAMP_CPU_MONOTONIC (0<<8)
DRM_PANTHOR_TIMESTAMP_CPU_MONOTONIC_RAW (1<<8)
DRM_PANTHOR_TIMESTAMP_CPU_REALTIME (2<<8)
DRM_PANTHOR_TIMESTAMP_CPU_BOOTTIME (3<<8)
DRM_PANTHOR_TIMESTAMP_CPU_TAI (4<<8)
and DRM_PANTHOR_TIMESTAMP_CPU_TYPE_MASK would be (7<<8).
If flags is 0 it would become
(DRM_PANTHOR_TIMESTAMP_GPU |
DRM_PANTHOR_TIMESTAMP_OFFSET |
DRM_PANTHOR_TIMESTAMP_FREQ)
For VK_KHR_calibrated_timestamps flags would be set as
(DRM_PANTHOR_TIMESTAMP_GPU |
DRM_PANTHOR_TIMESTAMP_CPU |
DRM_PANTHOR_TIMESTAMP_DURATION |
DRM_PANTHOR_TIMESTAMP_SAME_TIME |
(raw ? DRM_PANTHOR_TIMESTAMP_CPU_MONOTONIC_RAW : DRM_PANTHOR_TIMESTAMP_CPU_MONOTONIC))
4) The core of the functionality would query all required timing
information with preemption and irqs disabled iif SAME_TIME flag is set.
Probably we should exclude OFFSET and FREQ from that.
Why also interrupts disabled?
Recently we discovered that unrelated devices can raise interrupts for
so long that the assumption of timestamps being taken at the same time
completely breaks down (they are hundreds of microseconds apart).
What do you think?
Cheers,
Marcin
> This is needed because CPU and GPU timestamps are captured separately.
> The implementation makes an effort to minimize the capture duration,
> which is crucial for calibration and not exactly feasible from
> userspace.
>
> Signed-off-by: Chia-I Wu <olvaffe@...il.com>
>
> ---
> The query is inspired by xe's DRM_XE_DEVICE_QUERY_ENGINE_CYCLES and the
> naming is inspired by VK_KHR_calibrated_timestamps. The userspace change
> is https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37424.
> ---
> drivers/gpu/drm/panthor/panthor_drv.c | 88 ++++++++++++++++++++++++++-
> include/uapi/drm/panthor_drm.h | 31 ++++++++++
> 2 files changed, 118 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/panthor/panthor_drv.c b/drivers/gpu/drm/panthor/panthor_drv.c
> index fdbe89ef7f43c..06da6dcf016ef 100644
> --- a/drivers/gpu/drm/panthor/panthor_drv.c
> +++ b/drivers/gpu/drm/panthor/panthor_drv.c
> @@ -13,6 +13,7 @@
> #include <linux/pagemap.h>
> #include <linux/platform_device.h>
> #include <linux/pm_runtime.h>
> +#include <linux/sched/clock.h>
> #include <linux/time64.h>
>
> #include <drm/drm_auth.h>
> @@ -172,6 +173,7 @@ panthor_get_uobj_array(const struct drm_panthor_obj_array *in, u32 min_stride,
> PANTHOR_UOBJ_DECL(struct drm_panthor_csif_info, pad), \
> PANTHOR_UOBJ_DECL(struct drm_panthor_timestamp_info, current_timestamp), \
> PANTHOR_UOBJ_DECL(struct drm_panthor_group_priorities_info, pad), \
> + PANTHOR_UOBJ_DECL(struct drm_panthor_calibrated_timestamp_info, gpu_timestamp), \
> PANTHOR_UOBJ_DECL(struct drm_panthor_sync_op, timeline_value), \
> PANTHOR_UOBJ_DECL(struct drm_panthor_queue_submit, syncs), \
> PANTHOR_UOBJ_DECL(struct drm_panthor_queue_create, ringbuf_size), \
> @@ -779,6 +781,74 @@ static int panthor_query_timestamp_info(struct panthor_device *ptdev,
> return 0;
> }
>
> +static int panthor_query_calibrated_timestamp_info(
> + struct panthor_device *ptdev, const struct drm_panthor_calibrated_timestamp_info __user *in,
> + u32 in_size, struct drm_panthor_calibrated_timestamp_info *out)
> +{
> + /* cpu_clockid and pad take up the first 8 bytes */
> + const u32 min_size = 8;
> + u64 (*cpu_timestamp)(void);
> + int ret;
> +
> + if (in_size < min_size)
> + return -EINVAL;
> + if (!access_ok(in, min_size))
> + return -EFAULT;
> + ret = __get_user(out->cpu_clockid, &in->cpu_clockid);
> + if (ret)
> + return ret;
> + ret = __get_user(out->pad, &in->pad);
> + if (ret)
> + return ret;
> +
> + switch (out->cpu_clockid) {
> + case CLOCK_MONOTONIC:
> + cpu_timestamp = ktime_get_ns;
> + break;
> + case CLOCK_MONOTONIC_RAW:
> + cpu_timestamp = ktime_get_raw_ns;
> + break;
> + case CLOCK_REALTIME:
> + cpu_timestamp = ktime_get_real_ns;
> + break;
> + case CLOCK_BOOTTIME:
> + cpu_timestamp = ktime_get_boottime_ns;
> + break;
> + case CLOCK_TAI:
> + cpu_timestamp = ktime_get_clocktai_ns;
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + if (out->pad)
> + return -EINVAL;
> +
> + ret = panthor_device_resume_and_get(ptdev);
> + if (ret)
> + return ret;
> +
> + do {
> + const u32 hi = gpu_read(ptdev, GPU_TIMESTAMP + 4);
> +
> + /* keep duration minimal */
> + preempt_disable();
> + out->duration = local_clock();
> + out->cpu_timestamp = cpu_timestamp();
> + out->gpu_timestamp = gpu_read(ptdev, GPU_TIMESTAMP);
> + out->duration = local_clock() - out->duration;
> + preempt_enable();
> +
> + if (likely(hi == gpu_read(ptdev, GPU_TIMESTAMP + 4))) {
> + out->gpu_timestamp |= (u64)hi << 32;
> + break;
> + }
We don't need to loop on everything to read GPU_TIMESTAMP - using
gpu_read64_counter(ptdev, GPU_TIMESTAMP) is enough to guarantee correctness
(IOW why would we need to reread cpu timestamp if gpu timestamp wrapped
around?)
> + } while (true);
> +
> + pm_runtime_put(ptdev->base.dev);
> + return 0;
> +}
> +
> static int group_priority_permit(struct drm_file *file,
> u8 priority)
> {
> @@ -815,6 +885,7 @@ static int panthor_ioctl_dev_query(struct drm_device *ddev, void *data, struct d
> struct drm_panthor_dev_query *args = data;
> struct drm_panthor_timestamp_info timestamp_info;
> struct drm_panthor_group_priorities_info priorities_info;
> + struct drm_panthor_calibrated_timestamp_info calibrated_timestamp_info;
> int ret;
>
> if (!args->pointer) {
> @@ -835,6 +906,10 @@ static int panthor_ioctl_dev_query(struct drm_device *ddev, void *data, struct d
> args->size = sizeof(priorities_info);
> return 0;
>
> + case DRM_PANTHOR_DEV_QUERY_CALIBRATED_TIMESTAMP_INFO:
> + args->size = sizeof(calibrated_timestamp_info);
> + return 0;
> +
> default:
> return -EINVAL;
> }
> @@ -859,6 +934,16 @@ static int panthor_ioctl_dev_query(struct drm_device *ddev, void *data, struct d
> panthor_query_group_priorities_info(file, &priorities_info);
> return PANTHOR_UOBJ_SET(args->pointer, args->size, priorities_info);
>
> + case DRM_PANTHOR_DEV_QUERY_CALIBRATED_TIMESTAMP_INFO: {
> + ret = panthor_query_calibrated_timestamp_info(ptdev, u64_to_user_ptr(args->pointer),
> + args->size,
> + &calibrated_timestamp_info);
> + if (ret)
> + return ret;
> +
> + return PANTHOR_UOBJ_SET(args->pointer, args->size, calibrated_timestamp_info);
> + }
> +
> default:
> return -EINVAL;
> }
> @@ -1601,6 +1686,7 @@ static void panthor_debugfs_init(struct drm_minor *minor)
> * - 1.3 - adds DRM_PANTHOR_GROUP_STATE_INNOCENT flag
> * - 1.4 - adds DRM_IOCTL_PANTHOR_BO_SET_LABEL ioctl
> * - 1.5 - adds DRM_PANTHOR_SET_USER_MMIO_OFFSET ioctl
> + * - 1.6 - adds DRM_PANTHOR_DEV_QUERY_CALIBRATED_TIMESTAMP_INFO query
> */
> static const struct drm_driver panthor_drm_driver = {
> .driver_features = DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
> @@ -1614,7 +1700,7 @@ static const struct drm_driver panthor_drm_driver = {
> .name = "panthor",
> .desc = "Panthor DRM driver",
> .major = 1,
> - .minor = 5,
> + .minor = 6,
>
> .gem_create_object = panthor_gem_create_object,
> .gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table,
> diff --git a/include/uapi/drm/panthor_drm.h b/include/uapi/drm/panthor_drm.h
> index 467d365ed7ba7..7f3ff43f17952 100644
> --- a/include/uapi/drm/panthor_drm.h
> +++ b/include/uapi/drm/panthor_drm.h
> @@ -243,6 +243,11 @@ enum drm_panthor_dev_query_type {
> * @DRM_PANTHOR_DEV_QUERY_GROUP_PRIORITIES_INFO: Query allowed group priorities information.
> */
> DRM_PANTHOR_DEV_QUERY_GROUP_PRIORITIES_INFO,
> +
> + /** @DRM_PANTHOR_DEV_QUERY_CALIBRATED_TIMESTAMP_INFO: Query calibrated
> + * timestamp information.
> + */
> + DRM_PANTHOR_DEV_QUERY_CALIBRATED_TIMESTAMP_INFO,
> };
>
> /**
> @@ -402,6 +407,32 @@ struct drm_panthor_group_priorities_info {
> __u8 pad[3];
> };
>
> +/**
> + * struct drm_panthor_calibrated_timestamp_info - Calibrated timestamp information
> + *
> + * Structure grouping all queryable information relating to the calibrated timestamp.
> + */
> +struct drm_panthor_calibrated_timestamp_info {
> + /** @clockid: The CPU clock id.
> + *
> + * Must be one of CLOCK_MONOTONIC, CLOCK_MONOTONIC_RAW,
> + * CLOCK_REALTIME, CLOCK_BOOTTIME, or CLOCK_TAI.
> + */
> + __s32 cpu_clockid;
> +
> + /** @pad: MBZ. */
> + __u32 pad;
> +
> + /** @duration: Duration for querying all timestamps in nanoseconds. */
> + __u64 duration;
> +
> + /** @cpu_timestamp: The current CPU timestamp in nanoseconds. */
> + __u64 cpu_timestamp;
> +
> + /** @gpu_timestamp: The current GPU timestamp in cycles. */
> + __u64 gpu_timestamp;
> +};
> +
> /**
> * struct drm_panthor_dev_query - Arguments passed to DRM_PANTHOR_IOCTL_DEV_QUERY
> */
> --
> 2.51.0.384.g4c02a37b29-goog
>
>
Powered by blists - more mailing lists