[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <de895a18-e40f-4996-b799-0a957bd3ca48@quicinc.com>
Date: Fri, 25 Oct 2024 11:55:07 -0600
From: Jeffrey Hugo <quic_jhugo@...cinc.com>
To: Lizhi Hou <lizhi.hou@....com>, <ogabbay@...nel.org>,
<dri-devel@...ts.freedesktop.org>
CC: <linux-kernel@...r.kernel.org>, <min.ma@....com>, <max.zhen@....com>,
<sonal.santan@....com>, <king.tam@....com>
Subject: Re: [PATCH V5 00/10] AMD XDNA driver
On 10/21/2024 10:19 AM, Lizhi Hou wrote:
> This patchset introduces a new Linux Kernel Driver, amdxdna for AMD NPUs.
> The driver is based on Linux accel subsystem.
>
> NPU (Neural Processing Unit) is an AI inference accelerator integrated
> into AMD client CPUs. NPU enables efficient execution of Machine Learning
> applications like CNNs, LLMs, etc. NPU is based on AMD XDNA
> architecture [1].
>
> AMD NPU consists of the following components:
>
> - Tiled array of AMD AI Engine processors.
> - Micro Controller which runs the NPU Firmware responsible for
> command processing, AIE array configuration, and execution management.
> - PCI EP for host control of the NPU device.
> - Interconnect for connecting the NPU components together.
> - SRAM for use by the NPU Firmware.
> - Address translation hardware for protected host memory access by the
> NPU.
>
> NPU supports multiple concurrent fully isolated contexts. Concurrent
> contexts may be bound to AI Engine array spatially and or temporarily.
>
> The driver is licensed under GPL-2.0 except for UAPI header which is
> licensed GPL-2.0 WITH Linux-syscall-note.
>
> User mode driver stack consists of XRT [2] and AMD AIE Plugin for IREE [3].
>
> The firmware for the NPU is distributed as a closed source binary, and has
> already been pushed to the DRM firmware repository [4].
>
> [1]https://www.amd.com/en/technologies/xdna.html
> [2]https://github.com/Xilinx/XRT
> [3]https://github.com/nod-ai/iree-amd-aie
> [4]https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu
>
> Changes since v4:
> - Fix lockdep errors
> - Use __u* structure for struct aie_error
One nit, when you send the next version would you please either To: or
Cc: me on the entire series? I only get pieces in my inbox which is
mildly annoying on my end.
Looks like we are getting close here. One procedural question I have,
do you have commit permissions to drm-misc?
I applied the series to drm-misc-next and tried to build. Got the
following errors -
CC [M] drivers/accel/amdxdna/aie2_ctx.o
CC [M] drivers/accel/amdxdna/aie2_error.o
CC [M] drivers/accel/amdxdna/aie2_message.o
CC [M] drivers/accel/amdxdna/aie2_pci.o
CC [M] drivers/accel/amdxdna/aie2_psp.o
CC [M] drivers/accel/amdxdna/aie2_smu.o
CC [M] drivers/accel/amdxdna/aie2_solver.o
CC [M] drivers/accel/amdxdna/amdxdna_ctx.o
CC [M] drivers/accel/amdxdna/amdxdna_gem.o
CC [M] drivers/accel/amdxdna/amdxdna_mailbox.o
CC [M] drivers/accel/amdxdna/amdxdna_mailbox_helper.o
CC [M] drivers/accel/amdxdna/amdxdna_pci_drv.o
CC [M] drivers/accel/amdxdna/amdxdna_sysfs.o
CC [M] drivers/accel/amdxdna/npu1_regs.o
CC [M] drivers/accel/amdxdna/npu2_regs.o
CC [M] drivers/accel/amdxdna/npu4_regs.o
CC [M] drivers/accel/amdxdna/npu5_regs.o
AR drivers/base/firmware_loader/built-in.a
AR drivers/base/built-in.a
In file included from drivers/accel/amdxdna/aie2_message.c:19:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
| ^~~~~~~~~
In file included from drivers/accel/amdxdna/amdxdna_gem.c:15:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
| ^~~~~~~~~
In file included from drivers/accel/amdxdna/aie2_psp.c:11:
drivers/accel/amdxdna/aie2_psp.c: In function ‘psp_exec’:
drivers/accel/amdxdna/aie2_psp.c:62:34: error: implicit declaration of
function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
62 | FIELD_GET(PSP_STATUS_READY,
ready),
| ^~~~~~~~~
./include/linux/iopoll.h:47:21: note: in definition of macro
‘read_poll_timeout’
47 | if (cond) \
| ^~~~
drivers/accel/amdxdna/aie2_psp.c:61:15: note: in expansion of macro
‘readx_poll_timeout’
61 | ret = readx_poll_timeout(readl, PSP_REG(psp,
PSP_STATUS_REG), ready,
| ^~~~~~~~~~~~~~~~~~
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
| ^~~~~~~~~~
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
| ^~~~~~~~~~
In file included from drivers/accel/amdxdna/aie2_pci.c:22:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
| ^~~~~~~~~
In file included from drivers/accel/amdxdna/aie2_ctx.c:18:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
| ^~~~~~~~~
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
| ^~~~~~~~~~
In file included from drivers/accel/amdxdna/amdxdna_ctx.c:16:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
| ^~~~~~~~~
cc1: all warnings being treated as errors
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
| ^~~~~~~~~~
drivers/accel/amdxdna/aie2_ctx.c: In function ‘aie2_hwctx_restart’:
drivers/accel/amdxdna/aie2_ctx.c:114:9: error: too few arguments to
function ‘drm_sched_start’
114 | drm_sched_start(&hwctx->priv->sched);
| ^~~~~~~~~~~~~~~
In file included from ./include/trace/events/amdxdna.h:12,
from drivers/accel/amdxdna/aie2_ctx.c:13:
./include/drm/gpu_scheduler.h:593:6: note: declared here
593 | void drm_sched_start(struct drm_gpu_scheduler *sched, int errno);
| ^~~~~~~~~~~~~~~
make[5]: *** [scripts/Makefile.build:229:
drivers/accel/amdxdna/aie2_psp.o] Error 1
make[5]: *** Waiting for unfinished jobs....
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
| ^~~~~~~~~~
In file included from drivers/accel/amdxdna/amdxdna_pci_drv.c:18:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
112 | return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
| ^~~~~~~~~
cc1: all warnings being treated as errors
make[5]: *** [scripts/Makefile.build:229:
drivers/accel/amdxdna/aie2_ctx.o] Error 1
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
121 | cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
| ^~~~~~~~~~
drivers/accel/amdxdna/amdxdna_mailbox.c: In function
‘xdna_mailbox_send_msg’:
drivers/accel/amdxdna/amdxdna_mailbox.c:444:26: error: implicit
declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
444 | header->sz_ver = FIELD_PREP(MSG_BODY_SZ, msg->send_size) |
| ^~~~~~~~~~
You also have the following checkpatch issues -
WARNING: 'Disalbe' may be misspelled - perhaps 'Disable'?
#1646: FILE: drivers/accel/amdxdna/amdxdna_mailbox.c:553:
+ /* Disalbe an irq and wait. This might sleep. */
^^^^^^^
WARNING: 'splite' may be misspelled - perhaps 'split'?
#1695: FILE: drivers/accel/amdxdna/amdxdna_mailbox.h:21:
+ * The mailbox will splite the sending data in to multiple firmware
message if
^^^^^^
WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#1875: FILE: drivers/accel/amdxdna/amdxdna_mailbox_helper.h:9:
+#define TX_TIMEOUT 2000 /* miliseconds */
^^^^^^^^^^^
WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#1876: FILE: drivers/accel/amdxdna/amdxdna_mailbox_helper.h:10:
+#define RX_TIMEOUT 5000 /* miliseconds */
^^^^^^^^^^^
total: 0 errors, 4 warnings, 0 checks, 1916 lines checked
NOTE: For some of the reported defects, checkpatch may be able to
mechanically convert to the typical style using --fix or
--fix-inplace.
0003-accel-amdxdna-Support-hardware-mailbox.patch has style problems,
please review.
0007-accel-amdxdna-Add-command-execution.patch
----------------------------------------------
WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#59: FILE: drivers/accel/amdxdna/aie2_ctx.c:27:
+#define HWCTX_MAX_TIMEOUT 60000 /* miliseconds */
^^^^^^^^^^^
WARNING: 'reverve' may be misspelled - perhaps 'reserve'?
#612: FILE: drivers/accel/amdxdna/aie2_ctx.c:779:
+ XDNA_WARN(xdna, "Failed to reverve fence, ret %d", ret);
^^^^^^^
WARNING: 'Exectuion' may be misspelled - perhaps 'Execution'?
#1899: FILE: drivers/accel/amdxdna/amdxdna_pci_drv.c:139:
+ /* Exectuion */
^^^^^^^^^
WARNING: 'exectuion' may be misspelled - perhaps 'execution'?
#2113: FILE: include/uapi/drm/amdxdna_accel.h:239:
+ * struct amdxdna_drm_wait_cmd - Wait exectuion command.
^^^^^^^^^
total: 0 errors, 10 warnings, 0 checks, 1983 lines checked
NOTE: For some of the reported defects, checkpatch may be able to
mechanically convert to the typical style using --fix or
--fix-inplace.
0007-accel-amdxdna-Add-command-execution.patch has style problems,
please review.
0008-accel-amdxdna-Add-suspend-and-resume.patch
-----------------------------------------------
WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#163: FILE: drivers/accel/amdxdna/amdxdna_pci_drv.c:22:
+#define AMDXDNA_AUTOSUSPEND_DELAY 5000 /* miliseconds */
^^^^^^^^^^^
total: 0 errors, 1 warnings, 0 checks, 302 lines checked
NOTE: For some of the reported defects, checkpatch may be able to
mechanically convert to the typical style using --fix or
--fix-inplace.
0008-accel-amdxdna-Add-suspend-and-resume.patch has style problems,
please review.
-Jeff
Powered by blists - more mailing lists