[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7a8c001b-a535-0845-1679-0c1eccb6a082@amd.com>
Date: Tue, 3 Sep 2024 10:50:11 -0700
From: Lizhi Hou <lizhi.hou@....com>
To: Jeffrey Hugo <quic_jhugo@...cinc.com>, <ogabbay@...nel.org>,
<dri-devel@...ts.freedesktop.org>
CC: <linux-kernel@...r.kernel.org>, <min.ma@....com>, <max.zhen@....com>,
<sonal.santan@....com>, <king.tam@....com>
Subject: Re: [PATCH V2 00/10] AMD XDNA driver
Hi Jeffrey,
I have completed the code changes based on your comments. And we are
still working on the documentation.
Should I send out another patch set with only the code changes for your
to review? Or you would wait for the documentation to review together?
Thanks,
Lizhi
On 8/14/24 13:06, Lizhi Hou wrote:
>
> On 8/14/24 11:49, Jeffrey Hugo wrote:
>> On 8/12/2024 12:16 PM, Lizhi Hou wrote:
>>>
>>> On 8/9/24 08:21, Jeffrey Hugo wrote:
>>>> On 8/5/2024 11:39 AM, Lizhi Hou wrote:
>>>>> This patchset introduces a new Linux Kernel Driver, amdxdna for
>>>>> AMD NPUs.
>>>>> The driver is based on Linux accel subsystem.
>>>>>
>>>>> NPU (Neural Processing Unit) is an AI inference accelerator
>>>>> integrated
>>>>> into AMD client CPUs. NPU enables efficient execution of Machine
>>>>> Learning
>>>>> applications like CNNs, LLMs, etc. NPU is based on AMD XDNA
>>>>> architecture [1].
>>>>>
>>>>> AMD NPU consists of the following components:
>>>>>
>>>>> - Tiled array of AMD AI Engine processors.
>>>>> - Micro Controller which runs the NPU Firmware responsible for
>>>>> command processing, AIE array configuration, and execution
>>>>> management.
>>>>> - PCI EP for host control of the NPU device.
>>>>> - Interconnect for connecting the NPU components together.
>>>>> - SRAM for use by the NPU Firmware.
>>>>> - Address translation hardware for protected host memory access
>>>>> by the
>>>>> NPU.
>>>>>
>>>>> NPU supports multiple concurrent fully isolated contexts. Concurrent
>>>>> contexts may be bound to AI Engine array spatially and or
>>>>> temporarily.
>>>>>
>>>>> The driver is licensed under GPL-2.0 except for UAPI header which is
>>>>> licensed GPL-2.0 WITH Linux-syscall-note.
>>>>>
>>>>> User mode driver stack consists of XRT [2] and AMD AIE Plugin for
>>>>> IREE [3].
>>>>
>>>> Is there a special branch with the code? I don't see any of the
>>>> uAPI in either project when searching for the ioctl codes or ioctl
>>>> structures.
>>>
>>> Please see git repo: https://github.com/amd/xdna-driver
>>>
>>> This contains the out tree driver and shim code which interact with
>>> driver. E.g.
>>>
>>> https://github.com/amd/xdna-driver/blob/main/src/shim/bo.cpp#L18
>>
>> Ok, I need to have a look at this. Long term is the plan to move the
>> shim to the XRT repo once the driver is merged upstream?
> Yes.
>>
>>>
>>>>
>>>>>
>>>>> The firmware for the NPU is distributed as a closed source binary,
>>>>> and has
>>>>> already been pushed to the DRM firmware repository [4].
>>>>>
>>>>> [1] https://www.amd.com/en/technologies/xdna.html
>>>>> [2] https://github.com/Xilinx/XRT
>>>>> [3] https://github.com/nod-ai/iree-amd-aie
>>>>> [4]
>>>>> https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu
>>>>>
>>>>>
>>>>> Changes since v1:
>>>>> - Remove some inline defines
>>>>> - Minor changes based code review comments
>>>>>
>>>>> Lizhi Hou (10):
>>>>> accel/amdxdna: Add a new driver for AMD AI Engine
>>>>> accel/amdxdna: Support hardware mailbox
>>>>> accel/amdxdna: Add hardware resource solver
>>>>> accel/amdxdna: Add hardware context
>>>>> accel/amdxdna: Add GEM buffer object management
>>>>> accel/amdxdna: Add command execution
>>>>> accel/amdxdna: Add suspend and resume
>>>>> accel/amdxdna: Add error handling
>>>>> accel/amdxdna: Add query functions
>>>>> accel/amdxdna: Add firmware debug buffer support
>>>>>
>>>>> MAINTAINERS | 9 +
>>>>> drivers/accel/Kconfig | 1 +
>>>>> drivers/accel/Makefile | 1 +
>>>>> drivers/accel/amdxdna/Kconfig | 15 +
>>>>> drivers/accel/amdxdna/Makefile | 22 +
>>>>> drivers/accel/amdxdna/TODO | 4 +
>>>>> drivers/accel/amdxdna/aie2_ctx.c | 949
>>>>> ++++++++++++++++++
>>>>> drivers/accel/amdxdna/aie2_error.c | 349 +++++++
>>>>> drivers/accel/amdxdna/aie2_message.c | 775 ++++++++++++++
>>>>> drivers/accel/amdxdna/aie2_msg_priv.h | 372 +++++++
>>>>> drivers/accel/amdxdna/aie2_pci.c | 756 ++++++++++++++
>>>>> drivers/accel/amdxdna/aie2_pci.h | 264 +++++
>>>>> drivers/accel/amdxdna/aie2_psp.c | 137 +++
>>>>> drivers/accel/amdxdna/aie2_smu.c | 112 +++
>>>>> drivers/accel/amdxdna/aie2_solver.c | 329 ++++++
>>>>> drivers/accel/amdxdna/aie2_solver.h | 156 +++
>>>>> drivers/accel/amdxdna/amdxdna_ctx.c | 597 +++++++++++
>>>>> drivers/accel/amdxdna/amdxdna_ctx.h | 165 +++
>>>>> drivers/accel/amdxdna/amdxdna_drm.c | 172 ++++
>>>>> drivers/accel/amdxdna/amdxdna_drm.h | 114 +++
>>>>> drivers/accel/amdxdna/amdxdna_gem.c | 700 +++++++++++++
>>>>> drivers/accel/amdxdna/amdxdna_gem.h | 73 ++
>>>>> drivers/accel/amdxdna/amdxdna_mailbox.c | 582 +++++++++++
>>>>> drivers/accel/amdxdna/amdxdna_mailbox.h | 124 +++
>>>>> .../accel/amdxdna/amdxdna_mailbox_helper.c | 50 +
>>>>> .../accel/amdxdna/amdxdna_mailbox_helper.h | 43 +
>>>>> drivers/accel/amdxdna/amdxdna_pci_drv.c | 234 +++++
>>>>> drivers/accel/amdxdna/amdxdna_pci_drv.h | 31 +
>>>>> drivers/accel/amdxdna/amdxdna_sysfs.c | 58 ++
>>>>> drivers/accel/amdxdna/npu1_regs.c | 94 ++
>>>>> drivers/accel/amdxdna/npu2_regs.c | 111 ++
>>>>> drivers/accel/amdxdna/npu4_regs.c | 111 ++
>>>>> drivers/accel/amdxdna/npu5_regs.c | 111 ++
>>>>> include/trace/events/amdxdna.h | 101 ++
>>>>> include/uapi/drm/amdxdna_accel.h | 456 +++++++++
>>>>> 35 files changed, 8178 insertions(+)
>>>>> create mode 100644 drivers/accel/amdxdna/Kconfig
>>>>> create mode 100644 drivers/accel/amdxdna/Makefile
>>>>> create mode 100644 drivers/accel/amdxdna/TODO
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_ctx.c
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_error.c
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_message.c
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_msg_priv.h
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_pci.c
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_pci.h
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_psp.c
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_smu.c
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_solver.c
>>>>> create mode 100644 drivers/accel/amdxdna/aie2_solver.h
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_ctx.c
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_ctx.h
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_drm.c
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_drm.h
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_gem.c
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_gem.h
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox.c
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox.h
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox_helper.c
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox_helper.h
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_pci_drv.c
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_pci_drv.h
>>>>> create mode 100644 drivers/accel/amdxdna/amdxdna_sysfs.c
>>>>> create mode 100644 drivers/accel/amdxdna/npu1_regs.c
>>>>> create mode 100644 drivers/accel/amdxdna/npu2_regs.c
>>>>> create mode 100644 drivers/accel/amdxdna/npu4_regs.c
>>>>> create mode 100644 drivers/accel/amdxdna/npu5_regs.c
>>>>> create mode 100644 include/trace/events/amdxdna.h
>>>>> create mode 100644 include/uapi/drm/amdxdna_accel.h
>>>>>
>>>>
>>>> No Documentation?
>>>
>>> Is it ok to add a work item to TODO and add documentation in later
>>> patches?
>>
>> I beleive best practice would be to add Documnetation in the same
>> patch/series that adds the functionality. I'm not expecting
>> Documentation for items not implemented in this series, however I
>> think describing the product/architecture/other high level topics
>> would help put the code in context during review.
>>
>> It does seem like the AMD GPU driver had a lot of documentation,
>> which makes the lack of documentation for the AMD Accel driver
>> particularly odd.
>
> Ok. We will work on the document
>
>
> Thanks,
>
> Lizhi
>
Powered by blists - more mailing lists