[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ad3edfe7-8dbf-b6a1-f51f-53ea600439c9@amd.com>
Date: Tue, 26 Apr 2022 10:29:44 +0200
From: Christian König <christian.koenig@....com>
To: Cai Huoqing <cai.huoqing@...ux.dev>
Cc: Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
Sumit Semwal <sumit.semwal@...aro.org>,
linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
linux-media@...r.kernel.org, linaro-mm-sig@...ts.linaro.org
Subject: Re: [PATCH v2 4/4] drm/nvdla/uapi: Add UAPI of NVDLA driver
Am 26.04.22 um 10:23 schrieb Cai Huoqing:
> On 26 4月 22 08:31:05, Christian König wrote:
>> Am 26.04.22 um 08:08 schrieb Cai Huoqing:
>>> The NVIDIA Deep Learning Accelerator (NVDLA) is an open source IP
>>> which is integrated into NVIDIA Jetson AGX Xavier,
>>> so add UAPI of this driver.
>>>
>>> Signed-off-by: Cai Huoqing <cai.huoqing@...ux.dev>
>>> ---
>>> v1->v2:
>>> *Rename nvdla_drm.[ch] to nvdla_drv.[ch] and rename nvdla_ioctl.h to nvdla_drm.h,
>>> move it to uapi.
>>> comments link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20bac605-97e6-e5cd-c4e4-83a8121645d8%40amd.com%2F&data=05%7C01%7Cchristian.koenig%40amd.com%7C0777513b15b34d20c30e08da275e235c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637865582541002548%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ziQSwKxqhevOLDxq%2FvgfinF8BG3hiAwmUxsH3ZzZF4E%3D&reserved=0
>>>
>>> include/uapi/drm/nvdla_drm.h | 99 ++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 99 insertions(+)
>>> create mode 100644 include/uapi/drm/nvdla_drm.h
>>>
>>> diff --git a/include/uapi/drm/nvdla_drm.h b/include/uapi/drm/nvdla_drm.h
>>> new file mode 100644
>>> index 000000000000..984635285525
>>> --- /dev/null
>>> +++ b/include/uapi/drm/nvdla_drm.h
>>> @@ -0,0 +1,99 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause */
>>> +/*
>>> + * Copyright (C) 2017-2018 NVIDIA CORPORATION.
>>> + * Copyright (C) 2022 Cai Huoqing
>>> + */
>>> +
>>> +#ifndef __LINUX_NVDLA_IOCTL_H
>>> +#define __LINUX_NVDLA_IOCTL_H
>>> +
>>> +#include <linux/ioctl.h>
>>> +#include <linux/types.h>
>>> +
>>> +#if !defined(__KERNEL__)
>>> +#define __user
>>> +#endif
>>> +
>>> +/**
>>> + * struct nvdla_mem_handle structure for memory handles
>>> + *
>>> + * @handle handle to DMA buffer allocated in userspace
>>> + * @reserved Reserved for padding
>>> + * @offset offset in bytes from start address of buffer
>>> + *
>>> + */
>>> +struct nvdla_mem_handle {
>>> + __u32 handle;
>>> + __u32 reserved;
>>> + __u64 offset;
>>> +};
>>> +
>>> +/**
>>> + * struct nvdla_ioctl_submit_task structure for single task information
>>> + *
>>> + * @num_addresses total number of entries in address_list
>>> + * @reserved Reserved for padding
>>> + * @address_list pointer to array of struct nvdla_mem_handle
>>> + *
>>> + */
>>> +struct nvdla_ioctl_submit_task {
>>> +#define NVDLA_MAX_BUFFERS_PER_TASK (6144)
>>> + __u32 num_addresses;
>>> +#define NVDLA_NO_TIMEOUT (0xffffffff)
>>> + __u32 timeout;
>> What format does that timeout value have?
>>
>> In general it is best practice to have absolute 64bit nanosecond timeouts
>> (to be used with ktime inside the kernel) so that restarting interrupted
>> IOCTLs works smooth.
>>
>>> + __u64 address_list;
>> Maybe make the comments inline, cause I just wanted to write that you should
>> note that this is pointing to an nvdla_mem_handle array until I saw the
>> comment above.
>>
>>> +};
>>> +
>>> +/**
>>> + * struct nvdla_submit_args structure for task submit
>>> + *
>>> + * @tasks pointer to array of struct nvdla_ioctl_submit_task
>>> + * @num_tasks number of entries in tasks
>>> + * @flags flags for task submit, no flags defined yet
>>> + * @version version of task structure
>>> + *
>>> + */
>>> +struct nvdla_submit_args {
>>> + __u64 tasks;
>>> + __u16 num_tasks;
>>> +#define NVDLA_MAX_TASKS_PER_SUBMIT 24
>>> +#define NVDLA_SUBMIT_FLAGS_ATOMIC (1 << 0)
>> Well that "no flags defined yet" from the comment above is probably outdated
>> :)
>>
>> A comment what this flag means would also be nice to have.
>>
>> Apart from all those nit picks that looks pretty solid to me. Just one core
>> functionality we usually have seems to be missing here: How is completion
>> signaling implemented?
> Hi,thank for your reply.
>
> Do you mean fence signal? In this driver, IOCTL_SUBMIT is a synchronous call
> which do task submission & wait for done completion. This accerletor deal
> with massive compute operator (Pooling, Conv...), that is different to
> GPU. It's unnecessary to expose fence API to UMD for reducing such less time.
You should probably add that as a comment somewhere here.
Thanks,
Christian.
>
> Thanks,
> Cai
>> Regards,
>> Christian.
>>
>>> + __u16 flags;
>>> + __u32 version;
>>> +};
>>> +
>>> +/**
>>> + * struct nvdla_gem_create_args for allocating DMA buffer through GEM
>>> + *
>>> + * @handle handle updated by kernel after allocation
>>> + * @flags implementation specific flags
>>> + * @size size of buffer to allocate
>>> + */
>>> +struct nvdla_gem_create_args {
>>> + __u32 handle;
>>> + __u32 flags;
>>> + __u64 size;
>>> +};
>>> +
>>> +/**
>>> + * struct nvdla_gem_map_offset_args for mapping DMA buffer
>>> + *
>>> + * @handle handle of the buffer
>>> + * @reserved reserved for padding
>>> + * @offset offset updated by kernel after mapping
>>> + */
>>> +struct nvdla_gem_map_offset_args {
>>> + __u32 handle;
>>> + __u32 reserved;
>>> + __u64 offset;
>>> +};
>>> +
>>> +#define DRM_NVDLA_SUBMIT 0x00
>>> +#define DRM_NVDLA_GEM_CREATE 0x01
>>> +#define DRM_NVDLA_GEM_MMAP 0x02
>>> +
>>> +#define DRM_IOCTL_NVDLA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_NVDLA_SUBMIT, struct nvdla_submit_args)
>>> +#define DRM_IOCTL_NVDLA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_NVDLA_GEM_CREATE, struct nvdla_gem_create_args)
>>> +#define DRM_IOCTL_NVDLA_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_NVDLA_GEM_MMAP, struct nvdla_gem_map_offset_args)
>>> +
>>> +#endif
Powered by blists - more mailing lists