lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <de895a18-e40f-4996-b799-0a957bd3ca48@quicinc.com>
Date: Fri, 25 Oct 2024 11:55:07 -0600
From: Jeffrey Hugo <quic_jhugo@...cinc.com>
To: Lizhi Hou <lizhi.hou@....com>, <ogabbay@...nel.org>,
        <dri-devel@...ts.freedesktop.org>
CC: <linux-kernel@...r.kernel.org>, <min.ma@....com>, <max.zhen@....com>,
        <sonal.santan@....com>, <king.tam@....com>
Subject: Re: [PATCH V5 00/10] AMD XDNA driver

On 10/21/2024 10:19 AM, Lizhi Hou wrote:
> This patchset introduces a new Linux Kernel Driver, amdxdna for AMD NPUs.
> The driver is based on Linux accel subsystem.
> 
> NPU (Neural Processing Unit) is an AI inference accelerator integrated
> into AMD client CPUs. NPU enables efficient execution of Machine Learning
> applications like CNNs, LLMs, etc.  NPU is based on AMD XDNA
> architecture [1].
> 
> AMD NPU consists of the following components:
> 
>    - Tiled array of AMD AI Engine processors.
>    - Micro Controller which runs the NPU Firmware responsible for
>      command processing, AIE array configuration, and execution management.
>    - PCI EP for host control of the NPU device.
>    - Interconnect for connecting the NPU components together.
>    - SRAM for use by the NPU Firmware.
>    - Address translation hardware for protected host memory access by the
>      NPU.
> 
> NPU supports multiple concurrent fully isolated contexts. Concurrent
> contexts may be bound to AI Engine array spatially and or temporarily.
> 
> The driver is licensed under GPL-2.0 except for UAPI header which is
> licensed GPL-2.0 WITH Linux-syscall-note.
> 
> User mode driver stack consists of XRT [2] and AMD AIE Plugin for IREE [3].
> 
> The firmware for the NPU is distributed as a closed source binary, and has
> already been pushed to the DRM firmware repository [4].
> 
> [1]https://www.amd.com/en/technologies/xdna.html
> [2]https://github.com/Xilinx/XRT
> [3]https://github.com/nod-ai/iree-amd-aie
> [4]https://gitlab.freedesktop.org/drm/firmware/-/tree/amd-ipu-staging/amdnpu
> 
> Changes since v4:
> - Fix lockdep errors
> - Use __u* structure for struct aie_error

One nit, when you send the next version would you please either To: or 
Cc: me on the entire series?  I only get pieces in my inbox which is 
mildly annoying on my end.

Looks like we are getting close here.  One procedural question I have, 
do you have commit permissions to drm-misc?

I applied the series to drm-misc-next and tried to build.  Got the 
following errors -

   CC [M]  drivers/accel/amdxdna/aie2_ctx.o
   CC [M]  drivers/accel/amdxdna/aie2_error.o
   CC [M]  drivers/accel/amdxdna/aie2_message.o
   CC [M]  drivers/accel/amdxdna/aie2_pci.o
   CC [M]  drivers/accel/amdxdna/aie2_psp.o
   CC [M]  drivers/accel/amdxdna/aie2_smu.o
   CC [M]  drivers/accel/amdxdna/aie2_solver.o
   CC [M]  drivers/accel/amdxdna/amdxdna_ctx.o
   CC [M]  drivers/accel/amdxdna/amdxdna_gem.o
   CC [M]  drivers/accel/amdxdna/amdxdna_mailbox.o
   CC [M]  drivers/accel/amdxdna/amdxdna_mailbox_helper.o
   CC [M]  drivers/accel/amdxdna/amdxdna_pci_drv.o
   CC [M]  drivers/accel/amdxdna/amdxdna_sysfs.o
   CC [M]  drivers/accel/amdxdna/npu1_regs.o
   CC [M]  drivers/accel/amdxdna/npu2_regs.o
   CC [M]  drivers/accel/amdxdna/npu4_regs.o
   CC [M]  drivers/accel/amdxdna/npu5_regs.o
   AR      drivers/base/firmware_loader/built-in.a
   AR      drivers/base/built-in.a
In file included from drivers/accel/amdxdna/aie2_message.c:19:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration 
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
       |                ^~~~~~~~~
In file included from drivers/accel/amdxdna/amdxdna_gem.c:15:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration 
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
       |                ^~~~~~~~~
In file included from drivers/accel/amdxdna/aie2_psp.c:11:
drivers/accel/amdxdna/aie2_psp.c: In function ‘psp_exec’:
drivers/accel/amdxdna/aie2_psp.c:62:34: error: implicit declaration of 
function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
    62 |                                  FIELD_GET(PSP_STATUS_READY, 
ready),
       |                                  ^~~~~~~~~
./include/linux/iopoll.h:47:21: note: in definition of macro 
‘read_poll_timeout’
    47 |                 if (cond) \
       |                     ^~~~
drivers/accel/amdxdna/aie2_psp.c:61:15: note: in expansion of macro 
‘readx_poll_timeout’
    61 |         ret = readx_poll_timeout(readl, PSP_REG(psp, 
PSP_STATUS_REG), ready,
       |               ^~~~~~~~~~~~~~~~~~
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration 
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
       |                        ^~~~~~~~~~
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration 
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
       |                        ^~~~~~~~~~
In file included from drivers/accel/amdxdna/aie2_pci.c:22:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration 
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
       |                ^~~~~~~~~
In file included from drivers/accel/amdxdna/aie2_ctx.c:18:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration 
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
       |                ^~~~~~~~~
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration 
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
       |                        ^~~~~~~~~~
In file included from drivers/accel/amdxdna/amdxdna_ctx.c:16:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration 
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
       |                ^~~~~~~~~
cc1: all warnings being treated as errors
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration 
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
       |                        ^~~~~~~~~~
drivers/accel/amdxdna/aie2_ctx.c: In function ‘aie2_hwctx_restart’:
drivers/accel/amdxdna/aie2_ctx.c:114:9: error: too few arguments to 
function ‘drm_sched_start’
   114 |         drm_sched_start(&hwctx->priv->sched);
       |         ^~~~~~~~~~~~~~~
In file included from ./include/trace/events/amdxdna.h:12,
                  from drivers/accel/amdxdna/aie2_ctx.c:13:
./include/drm/gpu_scheduler.h:593:6: note: declared here
   593 | void drm_sched_start(struct drm_gpu_scheduler *sched, int errno);
       |      ^~~~~~~~~~~~~~~
make[5]: *** [scripts/Makefile.build:229: 
drivers/accel/amdxdna/aie2_psp.o] Error 1
make[5]: *** Waiting for unfinished jobs....
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration 
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
       |                        ^~~~~~~~~~
In file included from drivers/accel/amdxdna/amdxdna_pci_drv.c:18:
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_get_op’:
drivers/accel/amdxdna/amdxdna_ctx.h:112:16: error: implicit declaration 
of function ‘FIELD_GET’ [-Werror=implicit-function-declaration]
   112 |         return FIELD_GET(AMDXDNA_CMD_OPCODE, cmd->header);
       |                ^~~~~~~~~
cc1: all warnings being treated as errors
make[5]: *** [scripts/Makefile.build:229: 
drivers/accel/amdxdna/aie2_ctx.o] Error 1
drivers/accel/amdxdna/amdxdna_ctx.h: In function ‘amdxdna_cmd_set_state’:
drivers/accel/amdxdna/amdxdna_ctx.h:121:24: error: implicit declaration 
of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
   121 |         cmd->header |= FIELD_PREP(AMDXDNA_CMD_STATE, s);
       |                        ^~~~~~~~~~
drivers/accel/amdxdna/amdxdna_mailbox.c: In function 
‘xdna_mailbox_send_msg’:
drivers/accel/amdxdna/amdxdna_mailbox.c:444:26: error: implicit 
declaration of function ‘FIELD_PREP’ [-Werror=implicit-function-declaration]
   444 |         header->sz_ver = FIELD_PREP(MSG_BODY_SZ, msg->send_size) |
       |                          ^~~~~~~~~~


You also have the following checkpatch issues -


WARNING: 'Disalbe' may be misspelled - perhaps 'Disable'?
#1646: FILE: drivers/accel/amdxdna/amdxdna_mailbox.c:553:
+       /* Disalbe an irq and wait. This might sleep. */
            ^^^^^^^

WARNING: 'splite' may be misspelled - perhaps 'split'?
#1695: FILE: drivers/accel/amdxdna/amdxdna_mailbox.h:21:
+ * The mailbox will splite the sending data in to multiple firmware 
message if
                      ^^^^^^

WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#1875: FILE: drivers/accel/amdxdna/amdxdna_mailbox_helper.h:9:
+#define TX_TIMEOUT 2000 /* miliseconds */
                             ^^^^^^^^^^^

WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#1876: FILE: drivers/accel/amdxdna/amdxdna_mailbox_helper.h:10:
+#define RX_TIMEOUT 5000 /* miliseconds */
                             ^^^^^^^^^^^

total: 0 errors, 4 warnings, 0 checks, 1916 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
       mechanically convert to the typical style using --fix or 
--fix-inplace.

0003-accel-amdxdna-Support-hardware-mailbox.patch has style problems, 
please review.



0007-accel-amdxdna-Add-command-execution.patch
----------------------------------------------
WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#59: FILE: drivers/accel/amdxdna/aie2_ctx.c:27:
+#define HWCTX_MAX_TIMEOUT      60000 /* miliseconds */
                                          ^^^^^^^^^^^

WARNING: 'reverve' may be misspelled - perhaps 'reserve'?
#612: FILE: drivers/accel/amdxdna/aie2_ctx.c:779:
+               XDNA_WARN(xdna, "Failed to reverve fence, ret %d", ret);
                                            ^^^^^^^

WARNING: 'Exectuion' may be misspelled - perhaps 'Execution'?
#1899: FILE: drivers/accel/amdxdna/amdxdna_pci_drv.c:139:
+       /* Exectuion */
            ^^^^^^^^^

WARNING: 'exectuion' may be misspelled - perhaps 'execution'?
#2113: FILE: include/uapi/drm/amdxdna_accel.h:239:
+ * struct amdxdna_drm_wait_cmd - Wait exectuion command.
                                        ^^^^^^^^^

total: 0 errors, 10 warnings, 0 checks, 1983 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
       mechanically convert to the typical style using --fix or 
--fix-inplace.

0007-accel-amdxdna-Add-command-execution.patch has style problems, 
please review.


0008-accel-amdxdna-Add-suspend-and-resume.patch
-----------------------------------------------
WARNING: 'miliseconds' may be misspelled - perhaps 'milliseconds'?
#163: FILE: drivers/accel/amdxdna/amdxdna_pci_drv.c:22:
+#define AMDXDNA_AUTOSUSPEND_DELAY      5000 /* miliseconds */
                                                 ^^^^^^^^^^^

total: 0 errors, 1 warnings, 0 checks, 302 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
       mechanically convert to the typical style using --fix or 
--fix-inplace.

0008-accel-amdxdna-Add-suspend-and-resume.patch has style problems, 
please review.


-Jeff

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ