Date:   Tue, 22 Sep 2020 08:41:01 -0300
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Gal Pressman <galpress@...zon.com>
Cc:     Oded Gabbay <oded.gabbay@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        izur@...ana.ai, Jakub Kicinski <kuba@...nel.org>,
        "Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>,
        netdev@...r.kernel.org, SW_Drivers <SW_Drivers@...ana.ai>,
        "David S. Miller" <davem@...emloft.net>,
        Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        linux-rdma@...r.kernel.org, Olof Johansson <olof@...om.net>
Subject: Re: [PATCH v3 00/14] Adding GAUDI NIC code to habanalabs driver

On Mon, Sep 21, 2020 at 02:22:02PM +0300, Gal Pressman wrote:

> What is considered a RoCE port in this case if it's not compliant with RoCE?
> Sounds like it's an implementation of RDMA over ethernet, not RoCE.
> Does GAUDI support UD/RC/.. QPs? Is it using a proprietary wire protocol?
> (BTW, Oded claims it's similar to nvlink, how is nvlink's implementation
> exposed? Or is it closed source?)

I think Oded was drawing a parallel to how nvlink is integral with the
compute element. From Oded's descriptions I don't think it is much
like nvlink at all.

> Jason, how do you imagine GAUDI in the RDMA subsystem? Userspace control path
> verbs (used by hl-thunk?) and all data path verbs exposed as kverbs (used by
> habanalabs driver)?
> So neither any userspace verbs apps could use it nor kernel ULPs?

Based on what Oded described it seems like a reasonable RDMA device
with some limitations around MR IOVA.

Looks like the desire is to create an RDMA WR and CQ ring in userspace,
and then co-mingle that with the compute side of the device.

So instead of doing the special IOCTL and mmap against the compute FD
it would create an RDMA QP and RDMA CQ, use dv to access the raw
internals, and the proprietary stack would have exactly the same stuff
it would have had with the misc ioctl.

But, completely separately, they'd also have to implement some of
verbs, which serves as the open source userspace showing how this HW
works. What that is depends largely on what their HW can do, and if
they want to connect to UCX/MPI/libfabric/etc.

A bunch of ioctl stubs or a few tests is far below our standard in
RDMA.

There may have been some argument that the compute side of this device
has no industry standards, so it should be in drivers/misc, but HPC
networking *does* have extensive standards and extensive open source
software stacks. It is very hard for me to see how a device in this
market could be competitive without integrating with that stuff.

Jason
