Message-ID: <c0a66ae8-43ec-257f-92c5-6ecbfcd45c1a@amd.com>
Date: Wed, 14 Aug 2024 13:06:49 -0700
From: Lizhi Hou <lizhi.hou@....com>
To: Jeffrey Hugo <quic_jhugo@...cinc.com>, <ogabbay@...nel.org>,
<dri-devel@...ts.freedesktop.org>
CC: <linux-kernel@...r.kernel.org>, <min.ma@....com>, <max.zhen@....com>,
<sonal.santan@....com>, <king.tam@....com>
Subject: Re: [PATCH V2 00/10] AMD XDNA driver
On 8/14/24 11:49, Jeffrey Hugo wrote:
> On 8/12/2024 12:16 PM, Lizhi Hou wrote:
>>
>> On 8/9/24 08:21, Jeffrey Hugo wrote:
>>> On 8/5/2024 11:39 AM, Lizhi Hou wrote:
>>>> This patchset introduces a new Linux Kernel Driver, amdxdna for AMD NPUs.
>>>> The driver is based on Linux accel subsystem.
>>>>
>>>> NPU (Neural Processing Unit) is an AI inference accelerator integrated
>>>> into AMD client CPUs. NPU enables efficient execution of Machine
>>>> Learning applications like CNNs, LLMs, etc. NPU is based on AMD XDNA
>>>> architecture [1].
>>>>
>>>> AMD NPU consists of the following components:
>>>>
>>>> - Tiled array of AMD AI Engine processors.
>>>> - Micro Controller which runs the NPU Firmware responsible for
>>>>   command processing, AIE array configuration, and execution
>>>>   management.
>>>> - PCI EP for host control of the NPU device.
>>>> - Interconnect for connecting the NPU components together.
>>>> - SRAM for use by the NPU Firmware.
>>>> - Address translation hardware for protected host memory access
>>>>   by the NPU.
>>>>
>>>> NPU supports multiple concurrent fully isolated contexts. Concurrent
>>>> contexts may be bound to AI Engine array spatially and/or temporally.
>>>>
>>>> The driver is licensed under GPL-2.0 except for UAPI header which is
>>>> licensed GPL-2.0 WITH Linux-syscall-note.
>>>>
>>>> User mode driver stack consists of XRT [2] and AMD AIE Plugin for
>>>> IREE [3].
>>>
>>> Is there a special branch with the code? I don't see any of the
>>> uAPI in either project when searching for the ioctl codes or ioctl
>>> structures.
>>
>> Please see git repo: https://github.com/amd/xdna-driver
>>
>> This contains the out-of-tree driver and the shim code which interacts
>> with the driver. E.g.
>>
>> https://github.com/amd/xdna-driver/blob/main/src/shim/bo.cpp#L18
>
> Ok, I need to have a look at this. Long term, is the plan to move the
> shim to the XRT repo once the driver is merged upstream?
Yes.
>
>>
>>>
>>>>
>>>> The firmware for the NPU is distributed as a closed source binary,
>>>> and has already been pushed to the DRM firmware repository [4].
>>>>
>>>> [1] https://www.amd.com/en/technologies/xdna.html
>>>> [2] https://github.com/Xilinx/XRT
>>>> [3] https://github.com/nod-ai/iree-amd-aie
>>>> [4] https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu
>>>>
>>>>
>>>> Changes since v1:
>>>> - Remove some inline defines
>>>> - Minor changes based on code review comments
>>>>
>>>> Lizhi Hou (10):
>>>> accel/amdxdna: Add a new driver for AMD AI Engine
>>>> accel/amdxdna: Support hardware mailbox
>>>> accel/amdxdna: Add hardware resource solver
>>>> accel/amdxdna: Add hardware context
>>>> accel/amdxdna: Add GEM buffer object management
>>>> accel/amdxdna: Add command execution
>>>> accel/amdxdna: Add suspend and resume
>>>> accel/amdxdna: Add error handling
>>>> accel/amdxdna: Add query functions
>>>> accel/amdxdna: Add firmware debug buffer support
>>>>
>>>> MAINTAINERS | 9 +
>>>> drivers/accel/Kconfig | 1 +
>>>> drivers/accel/Makefile | 1 +
>>>> drivers/accel/amdxdna/Kconfig | 15 +
>>>> drivers/accel/amdxdna/Makefile | 22 +
>>>> drivers/accel/amdxdna/TODO | 4 +
>>>> drivers/accel/amdxdna/aie2_ctx.c | 949 ++++++++++++++++++
>>>> drivers/accel/amdxdna/aie2_error.c | 349 +++++++
>>>> drivers/accel/amdxdna/aie2_message.c | 775 ++++++++++++++
>>>> drivers/accel/amdxdna/aie2_msg_priv.h | 372 +++++++
>>>> drivers/accel/amdxdna/aie2_pci.c | 756 ++++++++++++++
>>>> drivers/accel/amdxdna/aie2_pci.h | 264 +++++
>>>> drivers/accel/amdxdna/aie2_psp.c | 137 +++
>>>> drivers/accel/amdxdna/aie2_smu.c | 112 +++
>>>> drivers/accel/amdxdna/aie2_solver.c | 329 ++++++
>>>> drivers/accel/amdxdna/aie2_solver.h | 156 +++
>>>> drivers/accel/amdxdna/amdxdna_ctx.c | 597 +++++++++++
>>>> drivers/accel/amdxdna/amdxdna_ctx.h | 165 +++
>>>> drivers/accel/amdxdna/amdxdna_drm.c | 172 ++++
>>>> drivers/accel/amdxdna/amdxdna_drm.h | 114 +++
>>>> drivers/accel/amdxdna/amdxdna_gem.c | 700 +++++++++++++
>>>> drivers/accel/amdxdna/amdxdna_gem.h | 73 ++
>>>> drivers/accel/amdxdna/amdxdna_mailbox.c | 582 +++++++++++
>>>> drivers/accel/amdxdna/amdxdna_mailbox.h | 124 +++
>>>> .../accel/amdxdna/amdxdna_mailbox_helper.c | 50 +
>>>> .../accel/amdxdna/amdxdna_mailbox_helper.h | 43 +
>>>> drivers/accel/amdxdna/amdxdna_pci_drv.c | 234 +++++
>>>> drivers/accel/amdxdna/amdxdna_pci_drv.h | 31 +
>>>> drivers/accel/amdxdna/amdxdna_sysfs.c | 58 ++
>>>> drivers/accel/amdxdna/npu1_regs.c | 94 ++
>>>> drivers/accel/amdxdna/npu2_regs.c | 111 ++
>>>> drivers/accel/amdxdna/npu4_regs.c | 111 ++
>>>> drivers/accel/amdxdna/npu5_regs.c | 111 ++
>>>> include/trace/events/amdxdna.h | 101 ++
>>>> include/uapi/drm/amdxdna_accel.h | 456 +++++++++
>>>> 35 files changed, 8178 insertions(+)
>>>> create mode 100644 drivers/accel/amdxdna/Kconfig
>>>> create mode 100644 drivers/accel/amdxdna/Makefile
>>>> create mode 100644 drivers/accel/amdxdna/TODO
>>>> create mode 100644 drivers/accel/amdxdna/aie2_ctx.c
>>>> create mode 100644 drivers/accel/amdxdna/aie2_error.c
>>>> create mode 100644 drivers/accel/amdxdna/aie2_message.c
>>>> create mode 100644 drivers/accel/amdxdna/aie2_msg_priv.h
>>>> create mode 100644 drivers/accel/amdxdna/aie2_pci.c
>>>> create mode 100644 drivers/accel/amdxdna/aie2_pci.h
>>>> create mode 100644 drivers/accel/amdxdna/aie2_psp.c
>>>> create mode 100644 drivers/accel/amdxdna/aie2_smu.c
>>>> create mode 100644 drivers/accel/amdxdna/aie2_solver.c
>>>> create mode 100644 drivers/accel/amdxdna/aie2_solver.h
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_ctx.c
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_ctx.h
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_drm.c
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_drm.h
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_gem.c
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_gem.h
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox.c
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox.h
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox_helper.c
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox_helper.h
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_pci_drv.c
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_pci_drv.h
>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_sysfs.c
>>>> create mode 100644 drivers/accel/amdxdna/npu1_regs.c
>>>> create mode 100644 drivers/accel/amdxdna/npu2_regs.c
>>>> create mode 100644 drivers/accel/amdxdna/npu4_regs.c
>>>> create mode 100644 drivers/accel/amdxdna/npu5_regs.c
>>>> create mode 100644 include/trace/events/amdxdna.h
>>>> create mode 100644 include/uapi/drm/amdxdna_accel.h
>>>>
>>>
>>> No Documentation?
>>
>> Is it ok to add a work item to TODO and add documentation in later
>> patches?
>
> I believe best practice would be to add Documentation in the same
> patch/series that adds the functionality. I'm not expecting
> Documentation for items not implemented in this series, however I
> think describing the product/architecture/other high level topics
> would help put the code in context during review.
>
> It does seem like the AMD GPU driver had a lot of documentation, which
> makes the lack of documentation for the AMD Accel driver particularly
> odd.
Ok. We will work on the documentation.
Thanks,
Lizhi