[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1aadcb3d-75e2-285c-2244-e472cc21bb97@quicinc.com>
Date: Wed, 14 Aug 2024 12:49:40 -0600
From: Jeffrey Hugo <quic_jhugo@...cinc.com>
To: Lizhi Hou <lizhi.hou@....com>, <ogabbay@...nel.org>,
<dri-devel@...ts.freedesktop.org>
CC: <linux-kernel@...r.kernel.org>, <min.ma@....com>, <max.zhen@....com>,
<sonal.santan@....com>, <king.tam@....com>
Subject: Re: [PATCH V2 00/10] AMD XDNA driver
On 8/12/2024 12:16 PM, Lizhi Hou wrote:
>
> On 8/9/24 08:21, Jeffrey Hugo wrote:
>> On 8/5/2024 11:39 AM, Lizhi Hou wrote:
>>> This patchset introduces a new Linux Kernel Driver, amdxdna for AMD
>>> NPUs.
>>> The driver is based on Linux accel subsystem.
>>>
>>> NPU (Neural Processing Unit) is an AI inference accelerator integrated
>>> into AMD client CPUs. NPU enables efficient execution of Machine
>>> Learning
>>> applications like CNNs, LLMs, etc. NPU is based on AMD XDNA
>>> architecture [1].
>>>
>>> AMD NPU consists of the following components:
>>>
>>> - Tiled array of AMD AI Engine processors.
>>> - Micro Controller which runs the NPU Firmware responsible for
>>> command processing, AIE array configuration, and execution
>>> management.
>>> - PCI EP for host control of the NPU device.
>>> - Interconnect for connecting the NPU components together.
>>> - SRAM for use by the NPU Firmware.
>>> - Address translation hardware for protected host memory access by
>>> the
>>> NPU.
>>>
>>> NPU supports multiple concurrent fully isolated contexts. Concurrent
>>> contexts may be bound to AI Engine array spatially and or temporarily.
>>>
>>> The driver is licensed under GPL-2.0 except for UAPI header which is
>>> licensed GPL-2.0 WITH Linux-syscall-note.
>>>
>>> User mode driver stack consists of XRT [2] and AMD AIE Plugin for
>>> IREE [3].
>>
>> Is there a special branch with the code? I don't see any of the uAPI
>> in either project when searching for the ioctl codes or ioctl structures.
>
> Please see git repo: https://github.com/amd/xdna-driver
>
> This contains the out tree driver and shim code which interact with
> driver. E.g.
>
> https://github.com/amd/xdna-driver/blob/main/src/shim/bo.cpp#L18
Ok, I need to have a look at this. Long term is the plan to move the
shim to the XRT repo once the driver is merged upstream?
>
>>
>>>
>>> The firmware for the NPU is distributed as a closed source binary,
>>> and has
>>> already been pushed to the DRM firmware repository [4].
>>>
>>> [1] https://www.amd.com/en/technologies/xdna.html
>>> [2] https://github.com/Xilinx/XRT
>>> [3] https://github.com/nod-ai/iree-amd-aie
>>> [4]
>>> https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu
>>>
>>>
>>> Changes since v1:
>>> - Remove some inline defines
>>> - Minor changes based code review comments
>>>
>>> Lizhi Hou (10):
>>> accel/amdxdna: Add a new driver for AMD AI Engine
>>> accel/amdxdna: Support hardware mailbox
>>> accel/amdxdna: Add hardware resource solver
>>> accel/amdxdna: Add hardware context
>>> accel/amdxdna: Add GEM buffer object management
>>> accel/amdxdna: Add command execution
>>> accel/amdxdna: Add suspend and resume
>>> accel/amdxdna: Add error handling
>>> accel/amdxdna: Add query functions
>>> accel/amdxdna: Add firmware debug buffer support
>>>
>>> MAINTAINERS | 9 +
>>> drivers/accel/Kconfig | 1 +
>>> drivers/accel/Makefile | 1 +
>>> drivers/accel/amdxdna/Kconfig | 15 +
>>> drivers/accel/amdxdna/Makefile | 22 +
>>> drivers/accel/amdxdna/TODO | 4 +
>>> drivers/accel/amdxdna/aie2_ctx.c | 949 ++++++++++++++++++
>>> drivers/accel/amdxdna/aie2_error.c | 349 +++++++
>>> drivers/accel/amdxdna/aie2_message.c | 775 ++++++++++++++
>>> drivers/accel/amdxdna/aie2_msg_priv.h | 372 +++++++
>>> drivers/accel/amdxdna/aie2_pci.c | 756 ++++++++++++++
>>> drivers/accel/amdxdna/aie2_pci.h | 264 +++++
>>> drivers/accel/amdxdna/aie2_psp.c | 137 +++
>>> drivers/accel/amdxdna/aie2_smu.c | 112 +++
>>> drivers/accel/amdxdna/aie2_solver.c | 329 ++++++
>>> drivers/accel/amdxdna/aie2_solver.h | 156 +++
>>> drivers/accel/amdxdna/amdxdna_ctx.c | 597 +++++++++++
>>> drivers/accel/amdxdna/amdxdna_ctx.h | 165 +++
>>> drivers/accel/amdxdna/amdxdna_drm.c | 172 ++++
>>> drivers/accel/amdxdna/amdxdna_drm.h | 114 +++
>>> drivers/accel/amdxdna/amdxdna_gem.c | 700 +++++++++++++
>>> drivers/accel/amdxdna/amdxdna_gem.h | 73 ++
>>> drivers/accel/amdxdna/amdxdna_mailbox.c | 582 +++++++++++
>>> drivers/accel/amdxdna/amdxdna_mailbox.h | 124 +++
>>> .../accel/amdxdna/amdxdna_mailbox_helper.c | 50 +
>>> .../accel/amdxdna/amdxdna_mailbox_helper.h | 43 +
>>> drivers/accel/amdxdna/amdxdna_pci_drv.c | 234 +++++
>>> drivers/accel/amdxdna/amdxdna_pci_drv.h | 31 +
>>> drivers/accel/amdxdna/amdxdna_sysfs.c | 58 ++
>>> drivers/accel/amdxdna/npu1_regs.c | 94 ++
>>> drivers/accel/amdxdna/npu2_regs.c | 111 ++
>>> drivers/accel/amdxdna/npu4_regs.c | 111 ++
>>> drivers/accel/amdxdna/npu5_regs.c | 111 ++
>>> include/trace/events/amdxdna.h | 101 ++
>>> include/uapi/drm/amdxdna_accel.h | 456 +++++++++
>>> 35 files changed, 8178 insertions(+)
>>> create mode 100644 drivers/accel/amdxdna/Kconfig
>>> create mode 100644 drivers/accel/amdxdna/Makefile
>>> create mode 100644 drivers/accel/amdxdna/TODO
>>> create mode 100644 drivers/accel/amdxdna/aie2_ctx.c
>>> create mode 100644 drivers/accel/amdxdna/aie2_error.c
>>> create mode 100644 drivers/accel/amdxdna/aie2_message.c
>>> create mode 100644 drivers/accel/amdxdna/aie2_msg_priv.h
>>> create mode 100644 drivers/accel/amdxdna/aie2_pci.c
>>> create mode 100644 drivers/accel/amdxdna/aie2_pci.h
>>> create mode 100644 drivers/accel/amdxdna/aie2_psp.c
>>> create mode 100644 drivers/accel/amdxdna/aie2_smu.c
>>> create mode 100644 drivers/accel/amdxdna/aie2_solver.c
>>> create mode 100644 drivers/accel/amdxdna/aie2_solver.h
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_ctx.c
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_ctx.h
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_drm.c
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_drm.h
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_gem.c
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_gem.h
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox.c
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox.h
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox_helper.c
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox_helper.h
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_pci_drv.c
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_pci_drv.h
>>> create mode 100644 drivers/accel/amdxdna/amdxdna_sysfs.c
>>> create mode 100644 drivers/accel/amdxdna/npu1_regs.c
>>> create mode 100644 drivers/accel/amdxdna/npu2_regs.c
>>> create mode 100644 drivers/accel/amdxdna/npu4_regs.c
>>> create mode 100644 drivers/accel/amdxdna/npu5_regs.c
>>> create mode 100644 include/trace/events/amdxdna.h
>>> create mode 100644 include/uapi/drm/amdxdna_accel.h
>>>
>>
>> No Documentation?
>
> Is it ok to add a work item to TODO and add documentation in later patches?
I beleive best practice would be to add Documnetation in the same
patch/series that adds the functionality. I'm not expecting
Documentation for items not implemented in this series, however I think
describing the product/architecture/other high level topics would help
put the code in context during review.
It does seem like the AMD GPU driver had a lot of documentation, which
makes the lack of documentation for the AMD Accel driver particularly odd.
Powered by blists - more mailing lists