lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFCwf12XZRxLYifSfuB+RGhuiKBytzsUTOnEa6FqfJHYvcVJPQ@mail.gmail.com>
Date:   Tue, 15 Sep 2020 23:46:58 +0300
From:   Oded Gabbay <oded.gabbay@...il.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     "Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>,
        netdev@...r.kernel.org, SW_Drivers <SW_Drivers@...ana.ai>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "David S. Miller" <davem@...emloft.net>,
        Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        linux-rdma@...r.kernel.org
Subject: Re: [PATCH v3 00/14] Adding GAUDI NIC code to habanalabs driver

On Tue, Sep 15, 2020 at 11:35 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Tue, 15 Sep 2020 20:10:08 +0300 Oded Gabbay wrote:
> > Hello,
> >
> > This is the second version of the patch-set to upstream the GAUDI NIC code
> > into the habanalabs driver.
> >
> > The only modification from v2 is in the ethtool patch (patch 12). Details
> > are in that patch's commit message.
>
> You keep reposting this, yet this SDK shim^W^W driver is still living in
> drivers/misc. If you want to make it look like a NIC, the code belongs
> where NIC drivers are.
>
> Then again, is it a NIC? Why do you have those custom IOCTLs? That's far
> from normal.

Hi Jakub,
I'm sorry but from your question it seems as if you didn't read my
cover letter at all, as I took great lengths in explaining exactly
what our device is and why we use custom IOCTLs.
TL;DR
We have an accelerator for deep learning (GAUDI) which uses RDMA as
infrastructure for communication between multiple accelerators. Same
as Nvidia uses NVlink, we use RDMA that we have inside our ASIC.
The RDMA implementation we did does NOT support some basic RDMA
IBverbs (such as MR and PD) and therefore, we can't use the rdma-core
library or to connect to the rdma infrastructure in the kernel. We
wanted to do it but when we analyzed it, we saw we wouldn't be able to
support basic stuff and therefore we had to revert to our IOCTLs.
To sum it up, because our NIC is used for intra-communication, we
don't expose nor intend users to use it as a NIC per-se. However, to
be able to get statistics and manage them in a standard way, and
support control plane over Ethernet, we do register each port to the
net subsystem (i.e. create netdev per port).

I hope this short summary explains this better.
As per your request that this code lives in the net subsystem, I think
that will make it only more complicated and hard to upstream and
maintain.
I see there are other examples (e.g. sgi-xp) that contain networking
driver code in misc so I don't understand this objection.
>
> Please make sure to CC linux-rdma. You clearly stated that the device
> does RDMA-like transfers.

We don't use the RDMA infrastructure in the kernel and we can't
connect to it due to the lack of H/W support we have so I don't see
why we need to CC linux-rdma.

Oded

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ