Message-ID: <20240216150546.GD13330@nvidia.com>
Date: Fri, 16 Feb 2024 11:05:46 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Christoph Hellwig <hch@...radead.org>,
Saeed Mahameed <saeed@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Leon Romanovsky <leonro@...dia.com>, Jiri Pirko <jiri@...dia.com>,
Leonid Bloch <lbloch@...dia.com>, Itay Avraham <itayavr@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>,
David Ahern <dsahern@...nel.org>,
Aron Silverton <aron.silverton@...cle.com>,
andrew.gospodarek@...adcom.com, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH V4 0/5] mlx5 ConnectX control misc driver
On Thu, Feb 15, 2024 at 05:00:46PM -0800, Jakub Kicinski wrote:
> But this is a bit of a vicious cycle, vendors have little incentive
> to interoperate, and primarily focus on adding secret sauce outside of
> the standard. In fact you're lucky if the vendor didn't bake some
> extension which requires custom switches into the NICs :(
This may all seem shocking if you come from the netdev world, but this
has been normal for HPC networking for the last 30 years at least.
My counter perspective would be that we are currently in a pretty good
moment for the HPC industry because we actually have open source
implementations for most of it. In fact most actual deployments are
running something quite close to the mainline open source stack.
The main holdout right now is Cray/HPE's Slingshot networking family
(based on ethernet apparently), which is less open source.
I would say the HPC community has a very different community goal post
than netdev land. Make your thing, whatever it is. Come with an open
kernel driver, an open rdma-core, an open libfabric/ucx and plug into
the open dpdk/nccl/ucx/libfabric layer and demonstrate your thing
works with openmpi/etc applications.
Supporting that open stack is broadly my north star from the kernel
perspective, much as Mesa is for DRM.
Several points of this chain are open industry standards driven by
technical working group communities.
This is what standardization and interoperability look like
here. It is probably totally foreign from a netdev viewpoint: far
less focus on the wire protocol, devices and kernel. Here the focus is
on application and software interoperability. Still, it is open in
a pretty solid way.
Jason