Date:   Tue, 23 Aug 2022 23:45:24 +0300
From:   Oded Gabbay <oded.gabbay@...il.com>
To:     Kevin Hilman <khilman@...libre.com>
Cc:     Dave Airlie <airlied@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Yuji Ishikawa <yuji2.ishikawa@...hiba.co.jp>,
        Jiho Chu <jiho.chu@...sung.com>,
        Alexandre Bailon <abailon@...libre.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        Arnd Bergmann <arnd@...db.de>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        "Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>
Subject: Re: New subsystem for acceleration devices

On Tue, Aug 23, 2022 at 9:24 PM Kevin Hilman <khilman@...libre.com> wrote:
>
> Hi Oded,
>
> Oded Gabbay <oded.gabbay@...il.com> writes:
>
> [...]
>
> > I want to update that I'm currently in discussions with Dave to figure
> > out what's the best way to move forward. We are writing it down to do
> > a proper comparison between the two paths (new accel subsystem or
> > using drm). I guess it will take a week or so.
>
> Any update on the discussions with Dave? and/or are there any plans to
> discuss this further at LPC/ksummit yet?
Hi Kevin.

We are still discussing the details. The habanalabs driver, at least,
is very complex, and there are multiple parts where I still need to
check whether, and how, they can be mapped to drm.
Some of us will attend LPC, so we will probably take advantage of that
to talk more about this.

>
> We (BayLibre) are upstreaming support for APUs on Mediatek SoCs, and are
> using the DRM-based approach.  I'll also be at LPC and happy to discuss
> in person.
>
> For some context on my/our interest: back in Sept 2020 we initially
> submitted an rpmsg-based driver for kernel communication[1].  After
> review comments, we rewrote that based on DRM[2] and are now using it
> for some MTK SoCs[3] and supporting our MTK customers with it.
>
> Hopefully we will get the kernel interfaces sorted out soon, but next,
> there's the userspace side of things.  To that end, we're also working
> on libAPU, a common, open userspace stack.  Alex Bailon presented a
> proposal earlier this year at Embedded Recipes in Paris
> (video[4], slides[5]).
>
> libAPU would include abstractions of the kernel interfaces for DRM
> (using libdrm), remoteproc/rpmsg, virtio etc. but also goes farther and
> proposes an open firmware for the accelerator side using
> libMetal/OpenAMP + rpmsg for communication with (most likely closed
> source) vendor firmware.  Think of this like sound open firmware (SOF[6]),
> but for accelerators.
I think your device and the habana device are very different in
nature, and part of what Dave and I discussed is whether these two
classes of devices can live together. I guess they can live together
in the kernel, but in userspace, not so much imo.

The first class is the edge inference devices (usually as part of some
SoC). I think your description of the APU on MTK SoC is a classic
example of such a device.
You usually have some firmware you load, you give it a graph and
pointers for input and output and then you just execute the graph
again and again to perform inference and just replace the inputs.
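
The usage pattern of this first class can be illustrated with a minimal
Python sketch. It is a toy model only: the class, its methods, the
firmware name and the doubling "graph" are all hypothetical and do not
correspond to any real driver or libAPU API.

```python
from typing import Callable, List, Optional

Graph = Callable[[List[float]], List[float]]


class EdgeInferenceDevice:
    """Toy stand-in for an edge inference accelerator (hypothetical API)."""

    def __init__(self) -> None:
        self.firmware: Optional[str] = None
        self.graph: Optional[Graph] = None
        self.runs = 0

    def load_firmware(self, name: str) -> None:
        # Step 1: load the firmware once.
        self.firmware = name

    def set_graph(self, graph: Graph) -> None:
        # Step 2: hand the device a graph; it stays resident across runs.
        if self.firmware is None:
            raise RuntimeError("load firmware before setting a graph")
        self.graph = graph

    def execute(self, inputs: List[float], outputs: List[float]) -> None:
        # Step 3: run the same graph again; only the input buffer changes.
        if self.graph is None:
            raise RuntimeError("no graph loaded")
        outputs[:] = self.graph(inputs)
        self.runs += 1


def run_inference(batches: List[List[float]]) -> List[List[float]]:
    dev = EdgeInferenceDevice()
    dev.load_firmware("apu-fw.bin")  # hypothetical firmware name
    dev.set_graph(lambda xs: [2.0 * x for x in xs])  # stand-in for a model
    out: List[float] = []
    results = []
    for batch in batches:
        dev.execute(batch, out)  # same graph, new inputs
        results.append(list(out))
    return results
```

The point of the sketch is that setup (firmware, graph) happens once,
while the steady state is a loop over a single execute call.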

The second class is the data-center training accelerators, of which
habana's gaudi device is an example. These devices usually
have a number of different compute engines, a fabric for scaling out,
on-device memory, internal MMUs and RAS monitoring requirements. Those
devices are usually operated via command queues, either through their
kernel driver or directly from user-space. They have multiple APIs for
memory management, RAS, scaling-out and command-submissions.
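
For contrast, here is a minimal Python sketch of the command-queue
style of operation. It is an assumption-laden toy: the engine names,
the Command fields and the CommandQueue methods are hypothetical, and
it models only command submission, not memory management, MMUs, RAS or
scale-out.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Iterable, List


@dataclass
class Command:
    engine: str   # target compute engine, e.g. "matmul" or "dma" (made-up names)
    payload: str  # opaque description of the work


class CommandQueue:
    """Toy command queue feeding several compute engines (hypothetical API)."""

    def __init__(self, engines: Iterable[str]) -> None:
        self.pending: Deque[Command] = deque()
        self.done: Dict[str, List[str]] = {name: [] for name in engines}
        self.next_seq = 0

    def submit(self, cmd: Command) -> int:
        # Reject work for an engine this device does not have.
        if cmd.engine not in self.done:
            raise ValueError(f"unknown engine: {cmd.engine}")
        self.pending.append(cmd)
        seq = self.next_seq
        self.next_seq += 1
        return seq  # sequence number the submitter could later wait on

    def process(self) -> int:
        # Drain the queue in submission order, dispatching per engine.
        count = 0
        while self.pending:
            cmd = self.pending.popleft()
            self.done[cmd.engine].append(cmd.payload)
            count += 1
        return count
```

Unlike the single repeated execute call of an edge device, the
submitter here issues a stream of heterogeneous commands and tracks
them by sequence number.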

>
> We've been using this successfully for Mediatek SoCs (which have a
> Cadence VP6 APU) and have submitted/published the code, including the
> OpenAMP[7] and libmetal[8] parts in addition to the kernel parts already
> mentioned.
What's the difference between libmetal and other open-source low-level
runtime drivers, such as oneAPI Level Zero?

Currently we have our own runtime driver, which is tightly coupled with
our h/w. For example, the way userspace "talks" to the data-plane
firmware is very proprietary, as it is hard-wired into the
architecture of the entire ASIC and how it performs deep-learning
training. Therefore, I don't see how this can be shared with other
vendors. Not because of secrecy, but because it is simply not relevant
to any other ASIC.

>
> We're at the point where we're pretty happy with how this works for MTK
> SoCs, and we want to collaborate with folks working on other platforms
> and to see what's needed to support other kinds of accelerators with a
> common userspace and open firmware infrastructure.
>
> Kevin
>
> [1] https://lore.kernel.org/r/20200930115350.5272-1-abailon@baylibre.com
> [2] https://lore.kernel.org/r/20210917125945.620097-1-abailon@baylibre.com
> [3] https://lore.kernel.org/r/20210819151340.741565-1-abailon@baylibre.com
> [4] https://www.youtube.com/watch?v=Uj1FZoF8MMw&t=18211s
> [5] https://embedded-recipes.org/2022/wp-content/uploads/2022/06/bailon.pdf
> [6] https://www.sofproject.org/
> [7] https://github.com/BayLibre/open-amp/tree/v2021.10-mtk
> [8] https://github.com/BayLibre/libmetal/tree/v2021.10-mtk
