Message-ID:
 <SH0PR01MB063461EBC046437C88A6AE84983BA@SH0PR01MB0634.CHNPR01.prod.partner.outlook.cn>
Date: Thu, 28 Mar 2024 07:46:01 +0000
From: Dejia Shang <Dejia.Shang@...china.com>
To: "ogabbay@...nel.org" <ogabbay@...nel.org>, "airlied@...hat.com"
	<airlied@...hat.com>, "daniel@...ll.ch" <daniel@...ll.ch>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
	"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>
Subject: About upstreaming ArmChina NPU driver

Dear Kernel Maintainers,

I am a driver developer and would like to upstream the ArmChina Zhouyi NPU driver ("Zhouyi" is the brand name) to the accel subsystem.

The driver is already open source (both UMD and KMD), and the code is available at https://github.com/Arm-China/Compass_NPU_Driver.git.

This driver is responsible for scheduling AI inference tasks to the NPU cores (V1/V2/V3). Specifically, a simplified end-to-end flow is:

        1. A TFLite/ONNX model is transformed into an executable binary file in ELF format by the NN graph compiler (designed by ArmChina).
        2. An application loads the executable binary file into the UMD and provides the input data.
        3. The UMD parses the binary and sends ioctls to the KMD (open device, do memory allocation/mmap/free, submit the job descriptor).
        4. The KMD dispatches the job to the NPU h/w, handles interrupts and updates the execution status.
        5. The UMD polls the status of the previously scheduled job.
        6. The application gets the output results.
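
To make steps 3 to 5 concrete, here is a rough user-space mock-up of the job life cycle. It is only an illustration, not code from the driver: every name in it (job_desc, mock_kmd, mock_submit and so on) is hypothetical, a single slot stands in for the NPU hardware queue, and the real KMD uses ioctls and interrupts where this sketch uses plain function calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical job states; the real driver's names may differ. */
enum job_state { JOB_IDLE, JOB_QUEUED, JOB_RUNNING, JOB_DONE };

/* Hypothetical job descriptor handed from UMD to KMD (step 3). */
struct job_desc {
	uint32_t id;
	enum job_state state;
};

/* Mock KMD: one slot standing in for the NPU hardware queue. */
struct mock_kmd {
	struct job_desc *current;
};

/* Step 3: submit a job descriptor. Returns 0 on success, -1 if busy. */
static int mock_submit(struct mock_kmd *kmd, struct job_desc *job)
{
	assert(kmd && job);
	if (kmd->current)
		return -1;
	job->state = JOB_QUEUED;
	kmd->current = job;
	return 0;
}

/* Step 4: dispatch the queued job to the (mock) hardware. */
static void mock_dispatch(struct mock_kmd *kmd)
{
	if (kmd->current && kmd->current->state == JOB_QUEUED)
		kmd->current->state = JOB_RUNNING;
}

/* Step 4: mock interrupt handler; marks the job done and frees the slot. */
static void mock_irq(struct mock_kmd *kmd)
{
	if (kmd->current && kmd->current->state == JOB_RUNNING) {
		kmd->current->state = JOB_DONE;
		kmd->current = NULL;
	}
}

/* Step 5: the UMD polls the job status. */
static enum job_state mock_poll(const struct job_desc *job)
{
	return job->state;
}
```

A caller would submit a job with mock_submit(), let mock_dispatch() and mock_irq() advance it, and call mock_poll() until it returns JOB_DONE before reading the results (step 6).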

Regarding upstreaming, I have a few questions:

Q1: Do you think our NPU driver is suitable for the accel subsystem? If so, which tree and branch should the patches be based on?

Q2: In the thread https://lore.kernel.org/lkml/ec547d33-214f-4952-aa33-c271e9edad63@kernel.org/, which discusses a similar case, Oded mentioned that:

        "If we would have upstreamed a new driver, the expectation would have been that we would use some drm mechanisms.", and
        "the minimal requirement is to use GEM/BOs for memory management operations".

I assume those requirements also apply to the Zhouyi NPU KMD. Currently, memory management (MM) in the KMD is built on the dma-mapping APIs; it handles both reserved CMA region(s) and SMMU-mapped buffers, and supports the dma-buf framework. Perhaps I should replace this implementation with the DRM APIs.

Q3: If you have looked at the KMD code, do you think I should make any other major changes before submitting the first patch series? Thank you!

Thanks for your time. I look forward to your reply. 😊

Best Regards,
Dejia