Message-ID: <c0a66ae8-43ec-257f-92c5-6ecbfcd45c1a@amd.com>
Date: Wed, 14 Aug 2024 13:06:49 -0700
From: Lizhi Hou <lizhi.hou@....com>
To: Jeffrey Hugo <quic_jhugo@...cinc.com>, <ogabbay@...nel.org>,
	<dri-devel@...ts.freedesktop.org>
CC: <linux-kernel@...r.kernel.org>, <min.ma@....com>, <max.zhen@....com>,
	<sonal.santan@....com>, <king.tam@....com>
Subject: Re: [PATCH V2 00/10] AMD XDNA driver


On 8/14/24 11:49, Jeffrey Hugo wrote:
> On 8/12/2024 12:16 PM, Lizhi Hou wrote:
>>
>> On 8/9/24 08:21, Jeffrey Hugo wrote:
>>> On 8/5/2024 11:39 AM, Lizhi Hou wrote:
>>>> This patchset introduces a new Linux kernel driver, amdxdna, for AMD NPUs.
>>>> The driver is based on the Linux accel subsystem.
>>>>
>>>> NPU (Neural Processing Unit) is an AI inference accelerator integrated
>>>> into AMD client CPUs. NPU enables efficient execution of Machine Learning
>>>> applications like CNNs, LLMs, etc. NPU is based on AMD XDNA
>>>> architecture [1].
>>>>
>>>> AMD NPU consists of the following components:
>>>>
>>>>    - Tiled array of AMD AI Engine processors.
>>>>    - Micro Controller which runs the NPU Firmware responsible for
>>>>      command processing, AIE array configuration, and execution
>>>>      management.
>>>>    - PCI EP for host control of the NPU device.
>>>>    - Interconnect for connecting the NPU components together.
>>>>    - SRAM for use by the NPU Firmware.
>>>>    - Address translation hardware for protected host memory access by
>>>>      the NPU.
>>>>
>>>> NPU supports multiple concurrent, fully isolated contexts. Concurrent
>>>> contexts may be bound to the AI Engine array spatially and/or temporally.
>>>>
>>>> The driver is licensed under GPL-2.0 except for UAPI header which is
>>>> licensed GPL-2.0 WITH Linux-syscall-note.
>>>>
>>>> User mode driver stack consists of XRT [2] and AMD AIE Plugin for IREE [3].
>>>
>>> Is there a special branch with the code?  I don't see any of the 
>>> uAPI in either project when searching for the ioctl codes or ioctl 
>>> structures.
>>
>> Please see git repo: https://github.com/amd/xdna-driver
>>
>> This contains the out-of-tree driver and the shim code that interacts
>> with the driver. E.g.
>>
>> https://github.com/amd/xdna-driver/blob/main/src/shim/bo.cpp#L18
>
> Ok, I need to have a look at this.  Long term is the plan to move the 
> shim to the XRT repo once the driver is merged upstream?
Yes.
>
>>
>>>
>>>>
>>>> The firmware for the NPU is distributed as a closed source binary, and has
>>>> already been pushed to the DRM firmware repository [4].
>>>>
>>>> [1] https://www.amd.com/en/technologies/xdna.html
>>>> [2] https://github.com/Xilinx/XRT
>>>> [3] https://github.com/nod-ai/iree-amd-aie
>>>> [4] https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu
>>>>
>>>>
>>>> Changes since v1:
>>>> - Remove some inline defines
>>>> - Minor changes based on code review comments
>>>>
>>>> Lizhi Hou (10):
>>>>    accel/amdxdna: Add a new driver for AMD AI Engine
>>>>    accel/amdxdna: Support hardware mailbox
>>>>    accel/amdxdna: Add hardware resource solver
>>>>    accel/amdxdna: Add hardware context
>>>>    accel/amdxdna: Add GEM buffer object management
>>>>    accel/amdxdna: Add command execution
>>>>    accel/amdxdna: Add suspend and resume
>>>>    accel/amdxdna: Add error handling
>>>>    accel/amdxdna: Add query functions
>>>>    accel/amdxdna: Add firmware debug buffer support
>>>>
>>>>   MAINTAINERS                                   |   9 +
>>>>   drivers/accel/Kconfig                         |   1 +
>>>>   drivers/accel/Makefile                        |   1 +
>>>>   drivers/accel/amdxdna/Kconfig                 |  15 +
>>>>   drivers/accel/amdxdna/Makefile                |  22 +
>>>>   drivers/accel/amdxdna/TODO                    |   4 +
>>>>   drivers/accel/amdxdna/aie2_ctx.c              | 949 ++++++++++++++++++
>>>>   drivers/accel/amdxdna/aie2_error.c            | 349 +++++++
>>>>   drivers/accel/amdxdna/aie2_message.c          | 775 ++++++++++++++
>>>>   drivers/accel/amdxdna/aie2_msg_priv.h         | 372 +++++++
>>>>   drivers/accel/amdxdna/aie2_pci.c              | 756 ++++++++++++++
>>>>   drivers/accel/amdxdna/aie2_pci.h              | 264 +++++
>>>>   drivers/accel/amdxdna/aie2_psp.c              | 137 +++
>>>>   drivers/accel/amdxdna/aie2_smu.c              | 112 +++
>>>>   drivers/accel/amdxdna/aie2_solver.c           | 329 ++++++
>>>>   drivers/accel/amdxdna/aie2_solver.h           | 156 +++
>>>>   drivers/accel/amdxdna/amdxdna_ctx.c           | 597 +++++++++++
>>>>   drivers/accel/amdxdna/amdxdna_ctx.h           | 165 +++
>>>>   drivers/accel/amdxdna/amdxdna_drm.c           | 172 ++++
>>>>   drivers/accel/amdxdna/amdxdna_drm.h           | 114 +++
>>>>   drivers/accel/amdxdna/amdxdna_gem.c           | 700 +++++++++++++
>>>>   drivers/accel/amdxdna/amdxdna_gem.h           |  73 ++
>>>>   drivers/accel/amdxdna/amdxdna_mailbox.c       | 582 +++++++++++
>>>>   drivers/accel/amdxdna/amdxdna_mailbox.h       | 124 +++
>>>>   .../accel/amdxdna/amdxdna_mailbox_helper.c    |  50 +
>>>>   .../accel/amdxdna/amdxdna_mailbox_helper.h    |  43 +
>>>>   drivers/accel/amdxdna/amdxdna_pci_drv.c       | 234 +++++
>>>>   drivers/accel/amdxdna/amdxdna_pci_drv.h       |  31 +
>>>>   drivers/accel/amdxdna/amdxdna_sysfs.c         |  58 ++
>>>>   drivers/accel/amdxdna/npu1_regs.c             |  94 ++
>>>>   drivers/accel/amdxdna/npu2_regs.c             | 111 ++
>>>>   drivers/accel/amdxdna/npu4_regs.c             | 111 ++
>>>>   drivers/accel/amdxdna/npu5_regs.c             | 111 ++
>>>>   include/trace/events/amdxdna.h                | 101 ++
>>>>   include/uapi/drm/amdxdna_accel.h              | 456 +++++++++
>>>>   35 files changed, 8178 insertions(+)
>>>>   create mode 100644 drivers/accel/amdxdna/Kconfig
>>>>   create mode 100644 drivers/accel/amdxdna/Makefile
>>>>   create mode 100644 drivers/accel/amdxdna/TODO
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_ctx.c
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_error.c
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_message.c
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_msg_priv.h
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_pci.c
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_pci.h
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_psp.c
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_smu.c
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_solver.c
>>>>   create mode 100644 drivers/accel/amdxdna/aie2_solver.h
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_ctx.c
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_ctx.h
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_drm.c
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_drm.h
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_gem.c
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_gem.h
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox.c
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox.h
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox_helper.c
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_mailbox_helper.h
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_pci_drv.c
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_pci_drv.h
>>>>   create mode 100644 drivers/accel/amdxdna/amdxdna_sysfs.c
>>>>   create mode 100644 drivers/accel/amdxdna/npu1_regs.c
>>>>   create mode 100644 drivers/accel/amdxdna/npu2_regs.c
>>>>   create mode 100644 drivers/accel/amdxdna/npu4_regs.c
>>>>   create mode 100644 drivers/accel/amdxdna/npu5_regs.c
>>>>   create mode 100644 include/trace/events/amdxdna.h
>>>>   create mode 100644 include/uapi/drm/amdxdna_accel.h
>>>>
>>>
>>> No Documentation?
>>
>> Is it ok to add a work item to TODO and add documentation in later 
>> patches?
>
> I believe best practice would be to add Documentation in the same 
> patch/series that adds the functionality.  I'm not expecting 
> Documentation for items not implemented in this series, however I 
> think describing the product/architecture/other high level topics 
> would help put the code in context during review.
>
> It does seem like the AMD GPU driver had a lot of documentation, which 
> makes the lack of documentation for the AMD Accel driver particularly 
> odd.

Ok. We will work on the documentation.


Thanks,

Lizhi

