lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Wed, 3 Apr 2024 10:09:53 +0000
From: Dejia Shang <Dejia.Shang@...china.com>
To: Oded Gabbay <oded.gabbay@...il.com>
CC: "ogabbay@...nel.org" <ogabbay@...nel.org>, "airlied@...hat.com"
	<airlied@...hat.com>, "daniel@...ll.ch" <daniel@...ll.ch>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
	"linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, Toby Huang <toby.huang@...china.com>,
	Chengkun Sun <CK.Sun@...china.com>
Subject: RE: About upstreaming ArmChina NPU driver


> -----Original Message-----
> From: Oded Gabbay <oded.gabbay@...il.com>
> Sent: 2024年4月3日 14:26
> To: Dejia Shang <Dejia.Shang@...china.com>
> Cc: ogabbay@...nel.org; airlied@...hat.com; daniel@...ll.ch;
> linux-kernel@...r.kernel.org; dri-devel@...ts.freedesktop.org;
> linux-arm-kernel@...ts.infradead.org
> Subject: Re: About upstreaming ArmChina NPU driver
> 
> On Thu, Mar 28, 2024 at 10:01 AM Dejia Shang <Dejia.Shang@...china.com>
> wrote:
> >
> > Dear Kernel Maintainers,
> >
> > I am a driver developer and would like to upstream the ArmChina Zhouyi
> NPU driver ("Zhouyi" is the brand) to accel subsystem.
> >
> > The driver is already open sourced (both UMD and KMD) and anyone can
> find the code from https://github.com/Arm-China/Compass_NPU_Driver.git.
> >
> > This driver is responsible for scheduling AI inference tasks to the NPU cores
> (V1/V2/V3). Specifically, a simplified end-to-end flow is:
> >
> >         1. A TFLite/ONNX model is transformed to an executable binary
> file in ELF format by the NN graph compiler (designed by ArmChina)
> >         2. An application loads the executable binary file to UMD and
> provides the input data.
> >         3. UMD parses the binary and sends ioctls to KMD (open device,
> do memory allocation/mmap/free, submit the job descriptor).
> >         4. KMD dispatches the job to NPU h/w, handles interrupts and
> updates the execution status.
> >         5. UMD polls the status of the pre-scheduled job.
> >         6. The application gets the output results.
> >
> > So...for the upstreaming,
> >
> > Q1: do you think our NPU driver is suitable for accel? If the answer is yes,
> which tree & branch should the patches be based on?
> Hi Dejia,
> Yes, it definitely sounds as a good fit to the accel subsystem.
> Please base your patches on "drm-misc-next" branch in drm-misc repo:
> https://anongit.freedesktop.org/git/drm/drm-misc.git
> 

Hi Oded,
Got it.

> >
> > Q2: in thread
> https://lore.kernel.org/lkml/ec547d33-214f-4952-aa33-c271e9edad63@kern
> el.org/ showing a similar case, Oded mentioned that:
> >
> >         "If we would have upstreamed a new driver, the expectation
> would have been that we would use some drm mechanisms.", and
> >         "the minimal requirement is to use GEM/BOs for memory
> management operations".
> >
> > I guess those requirements are also applicable for the Zhouyi NPU KMD?
> Currently, the memory management (MM) in KMD is based on dma-mapping
> APIs, which handles both reserved CMA region(s) and SMMU mapped buffers,
> and supports the dma-buf framework. Maybe I should replace the
> implementations with DRM APIs.
> Yes, those requirements definitely apply here.
> >
> > Q3: if you have looked at the KMD code, do you think I should make any
> other major change before submitting the first patch series? Thank you!
> I took a quick glance. In general, it seems to be ok, but I noticed two things
> related to the integration with drm/accel:
> 
> 1. You us a scheduler for the job submission, which provides the ability to
> defer jobs. In that case, I suggest to check if you can use drm_sched instead of
> your own implementation. No point in re-inventing the wheel.
> 2. You provide several memory zones for allocation of memory. I would
> suggest here to look at using ttm as the memory manager instead of
> re-implementing your own.

Thanks for your time! I will try to refactor the code as suggested and then send the first patch series.

> 
> And please remove the IMPORTANT NOTICE at the end of your emails. I
> would have to refrain from answering to further emails if that notice remains.

Now fixed. I did not realize that because the server auto appended the notice. Sorry for the inconvenience.

Best Regards,
Dejia

> 
> Thanks,
> Oded
> 
> >
> > Thanks for your time and look forward to your reply~ 😊
> >
> > Best Regards,
> > Dejia

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ