Message-ID: <1a36698c-a0dc-49d0-39fa-8c6823b4a9ed@amd.com>
Date: Fri, 4 Oct 2024 17:27:03 -0700
From: Lizhi Hou <lizhi.hou@....com>
To: Jeffrey Hugo <quic_jhugo@...cinc.com>, <ogabbay@...nel.org>,
<dri-devel@...ts.freedesktop.org>
CC: <linux-kernel@...r.kernel.org>, <min.ma@....com>, <max.zhen@....com>,
<sonal.santan@....com>, <king.tam@....com>
Subject: Re: [PATCH V3 01/11] accel/amdxdna: Add documentation for AMD NPU accelerator driver
On 10/4/24 10:06, Jeffrey Hugo wrote:
> On 9/11/2024 12:05 PM, Lizhi Hou wrote:
>> AMD NPU (Neural Processing Unit) is a multi-user AI inference accelerator
>> integrated into AMD client APU. NPU enables efficient execution of Machine
>> Learning applications like CNN, LLM, etc. NPU is based on AMD XDNA
>> Architecture. NPU is managed by amdxdna driver.
>>
>> Co-developed-by: Sonal Santan <sonal.santan@....com>
>> Signed-off-by: Sonal Santan <sonal.santan@....com>
>> Signed-off-by: Lizhi Hou <lizhi.hou@....com>
>> ---
>> Documentation/accel/amdxdna/amdnpu.rst | 283 +++++++++++++++++++++++++
>> Documentation/accel/amdxdna/index.rst | 11 +
>> Documentation/accel/index.rst | 1 +
>> 3 files changed, 295 insertions(+)
>> create mode 100644 Documentation/accel/amdxdna/amdnpu.rst
>> create mode 100644 Documentation/accel/amdxdna/index.rst
>>
>> diff --git a/Documentation/accel/amdxdna/amdnpu.rst b/Documentation/accel/amdxdna/amdnpu.rst
>> new file mode 100644
>> index 000000000000..2af3bc5b2a9e
>> --- /dev/null
>> +++ b/Documentation/accel/amdxdna/amdnpu.rst
>> @@ -0,0 +1,283 @@
>> +.. SPDX-License-Identifier: GPL-2.0-only
>> +
>> +.. include:: <isonum.txt>
>> +
>> +.. SPDX-License-Identifier: GPL-2.0-only
>
> SPDX twice?
I will remove one.
>
>> +
>> +=========
>> + AMD NPU
>> +=========
>> +
>> +:Copyright: |copy| 2024 Advanced Micro Devices, Inc.
>> +:Author: Sonal Santan <sonal.santan@....com>
>> +
>> +Overview
>> +========
>> +
>> +AMD NPU (Neural Processing Unit) is a multi-user AI inference accelerator
>> +integrated into AMD client APU. NPU enables efficient execution of Machine
>> +Learning applications like CNN, LLM, etc. NPU is based on
>> +`AMD XDNA Architecture`_. NPU is managed by **amdxdna** driver.
>> +
>> +
>> +Hardware Description
>> +====================
>> +
>> +AMD NPU consists of the following hardware components:
>> +
>> +AMD XDNA Array
>> +--------------
>> +
>> +AMD XDNA Array comprises a 2D array of compute and memory tiles built with
>> +`AMD AI Engine Technology`_. Each column has 4 rows of compute tiles and 1
>> +row of memory tile. Each compute tile contains a VLIW processor with its own
>> +dedicated program and data memory. The memory tile acts as L2 memory. The 2D
>> +array can be partitioned at a column boundary, creating a spatially isolated
>> +partition which can be bound to a workload context.
>> +
>> +Each column also has dedicated DMA engines to move data between host DDR and
>> +memory tile.
>> +
>> +AMD Phoenix and AMD Hawk Point client NPUs have a 4x5 topology, i.e., 4 rows
>> +of compute tiles arranged into 5 columns. AMD Strix Point client APU has a
>> +4x8 topology, i.e., 4 rows of compute tiles arranged into 8 columns.
>> +
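(Editor's aside, not part of the patch: the column-boundary partitioning
described above can be pictured as carving contiguous runs of whole columns
out of the array. The helper below is purely hypothetical illustration, not
driver code.)

```python
# Hypothetical illustration of column-boundary partitioning on a 5-column
# (Phoenix-style) array: a partition is a contiguous run of whole columns,
# spatially isolated, that can be bound to one workload context.
TOTAL_COLUMNS = 5

def carve_partition(free_columns, ncols):
    """Return the first ncols contiguous free columns, or None if unavailable."""
    for start in range(len(free_columns) - ncols + 1):
        run = free_columns[start:start + ncols]
        if run == list(range(run[0], run[0] + ncols)):
            return run
    return None

free = list(range(TOTAL_COLUMNS))        # columns 0..4 initially free
part_a = carve_partition(free, 2)        # a 2-column partition for one context
free = [c for c in free if c not in part_a]
part_b = carve_partition(free, 2)        # a second, isolated 2-column partition
print(part_a, part_b)
```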
>> +Shared L2 Memory
>> +................
>
> Why a line of "." instead of "-" like elsewhere?
I will fix it.
>
>> +
>> +The single row of memory tiles creates a pool of software managed on-chip L2
>> +memory. DMA engines are used to move data between host DDR and memory tiles.
>> +AMD Phoenix and AMD Hawk Point NPUs have a total of 2560 KB of L2 memory.
>> +AMD Strix Point NPU has a total of 4096 KB of L2 memory.
>> +
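(Editor's sanity check, not part of the patch: the L2 totals quoted above are
mutually consistent, working out to the same per-tile size on both topologies,
assuming one memory tile per column as the text states.)

```python
# Cross-check of the L2 totals quoted in the document above:
# one memory tile per column, so total L2 / columns = per-tile size.
npus = {
    "Phoenix / Hawk Point (4x5)": {"columns": 5, "l2_total_kb": 2560},
    "Strix Point (4x8)": {"columns": 8, "l2_total_kb": 4096},
}

for name, n in npus.items():
    per_tile_kb = n["l2_total_kb"] // n["columns"]
    print(f"{name}: {per_tile_kb} KB per memory tile")
# Both topologies work out to 512 KB per memory tile.
```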
>> +Microcontroller
>> +---------------
>> +
>> +A microcontroller runs NPU Firmware which is responsible for command
>> +processing, XDNA Array partition setup, XDNA Array configuration, workload
>> +context management and workload orchestration.
>> +
>> +NPU Firmware uses a dedicated instance of an isolated non-privileged context
>> +called ERT to service each workload context. ERT is also used to execute user
>> +provided ``ctrlcode`` associated with the workload context.
>> +
>> +NPU Firmware uses a single isolated privileged context called MERT to service
>> +management commands from the amdxdna driver.
>> +
>> +Mailboxes
>> +.........
>
> Again, odd delimiter
>
>> +
>> +The microcontroller and amdxdna driver use a privileged channel for
>> +management tasks like setting up of contexts, telemetry, query, error
>> +handling, setting up user channel, etc. As mentioned before, privileged
>> +channel requests are serviced by MERT. The privileged channel is bound to a
>> +single mailbox.
>> +
>> +The microcontroller and amdxdna driver use a dedicated user channel per
>> +workload context. The user channel is primarily used for submitting work to
>> +the NPU. As mentioned before, user channel requests are serviced by an
>> +instance of ERT. Each user channel is bound to its own dedicated mailbox.
>> +
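(Editor's aside, not from the patch: the channel/mailbox pairing described
above can be modeled as a tiny object sketch. All names below are
hypothetical illustration, not the driver's actual data structures.)

```python
# Illustrative model of the channel layout described above: one privileged
# channel serviced by MERT, plus one dedicated user channel (serviced by its
# own ERT instance) per workload context, each bound to its own mailbox.
class Mailbox:
    def __init__(self, mailbox_id):
        self.mailbox_id = mailbox_id

class Channel:
    def __init__(self, kind, serviced_by, mailbox):
        self.kind = kind                # "privileged" or "user"
        self.serviced_by = serviced_by  # "MERT" or "ERT-<n>"
        self.mailbox = mailbox          # each channel has a dedicated mailbox

# Single privileged channel for management (contexts, telemetry, errors, ...).
privileged = Channel("privileged", "MERT", Mailbox(0))

# One user channel per workload context, each with its own ERT instance and
# dedicated mailbox, used primarily for work submission.
contexts = [Channel("user", f"ERT-{i}", Mailbox(i + 1)) for i in range(3)]

# No two channels share a mailbox.
assert len({c.mailbox.mailbox_id for c in [privileged, *contexts]}) == 4
```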
>> +PCIe EP
>> +-------
>> +
>> +NPU is visible to the x86 as a PCIe device with multiple BARs and some MSI-X
>
> "to the x86" - feels like something is missing here. Maybe "x86 host
> CPU"?
Yes. I will change to "to the x86 host CPU".
>
>> +interrupt vectors. NPU uses a dedicated high bandwidth SoC level fabric for
>> +reading or writing into host memory. Each instance of ERT gets its own
>> +dedicated MSI-X interrupt. MERT gets a single instance of MSI-X interrupt.
>
> <snip>
>
>> diff --git a/Documentation/accel/amdxdna/index.rst b/Documentation/accel/amdxdna/index.rst
>> new file mode 100644
>> index 000000000000..38c16939f1fc
>> --- /dev/null
>> +++ b/Documentation/accel/amdxdna/index.rst
>> @@ -0,0 +1,11 @@
>> +.. SPDX-License-Identifier: GPL-2.0-only
>> +
>> +=====================================
>> + accel/amdxdna NPU driver
>> +=====================================
>> +
>> +The accel/amdxdna driver supports the AMD NPU (Neural Processing Unit).
>> +
>> +.. toctree::
>> +
>> + amdnpu
>> diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
>> index e94a0160b6a0..0a94b6766263 100644
>> --- a/Documentation/accel/index.rst
>> +++ b/Documentation/accel/index.rst
>> @@ -9,6 +9,7 @@ Compute Accelerators
>> introduction
>> qaic/index
>> + amdxdna/index
>
> I think alphabetical order makes sense to me, considering there
> probably should be more entries added over time. This would suggest
> that your addition should occur one line up. What do you think?
I will fix it.
Thanks,
Lizhi
>
>> .. only:: subproject and html
>