lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 24 Apr 2024 08:48:43 +0200
From: Boris Brezillon <boris.brezillon@...labora.com>
To: Adrián Larumbe <adrian.larumbe@...labora.com>
Cc: Steven Price <steven.price@....com>, Liviu Dudau <liviu.dudau@....com>,
 Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, Maxime Ripard
 <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, David Airlie
 <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>, kernel@...labora.com,
 dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/3] drm/panthor: introduce job cycle and timestamp
 accounting

On Tue, 23 Apr 2024 22:32:34 +0100
Adrián Larumbe <adrian.larumbe@...labora.com> wrote:

> Enable calculations of job submission times in clock cycles and wall
> time. This is done by expanding the boilerplate command stream when running
> a job to include instructions that compute said times right before an after
> a user CS.
> 
> Those numbers are stored in the queue's group's sync objects BO, right
> after them. Because the queues in a group might have a different number of
> slots, one must keep track of the overall slot tally when reckoning the
> offset of a queue's time sample structs, one for each slot.
> 
> NUM_INSTRS_PER_SLOT had to be increased to 32 because of adding new FW
> instructions for storing and subtracting the cycle counter and timestamp
> register, and it must always remain a power of two.
> 
> This commit is done in preparation for enabling DRM fdinfo support in the
> Panthor driver, which depends on the numbers calculated herein.
> 
> Signed-off-by: Adrián Larumbe <adrian.larumbe@...labora.com>
> ---
>  drivers/gpu/drm/panthor/panthor_sched.c | 158 ++++++++++++++++++++----
>  1 file changed, 134 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index b3a51a6de523..320dfa0388ba 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -93,6 +93,9 @@
>  #define MIN_CSGS				3
>  #define MAX_CSG_PRIO				0xf
>  
> +#define NUM_INSTRS_PER_SLOT			32
> +#define SLOTSIZE				(NUM_INSTRS_PER_SLOT * sizeof(u64))

Given everyone agreed on the profiling sysfs knob for Panfrost, I'm
tempted to make the profiling optional here as well, so we can save
space on the CS ring buffers when profiling is disabled. This means
adjusting the 'credits' parameter we pass to drm_sched_job_init()
accordingly, with one credit counting for an instruction (or a block of
16 instructions to keep things naturally cache-line aligned). You'll
also need to change the 'credit_limit' passed to drm_sched_init().
>  


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ