Message-ID: <A2C3C5F0-D86F-4D0C-8402-822063D2C6D1@fb.com>
Date: Tue, 28 Jul 2020 13:18:48 -0400
From: "Chris Mason" <clm@...com>
To: Greg KH <gregkh@...uxfoundation.org>
CC: Jonathan Lemon <jonathan.lemon@...il.com>,
<netdev@...r.kernel.org>, <kernel-team@...com>
Subject: Re: [RFC PATCH v2 21/21] netgpu/nvidia: add Nvidia plugin for netgpu
On 28 Jul 2020, at 12:31, Greg KH wrote:
> On Mon, Jul 27, 2020 at 03:44:44PM -0700, Jonathan Lemon wrote:
>> From: Jonathan Lemon <bsd@...com>
>>
>> This provides the interface between the netgpu core module and the
>> nvidia kernel driver. This should be built as an external module,
>> pointing to the nvidia build. For example:
>>
>> export NV_PACKAGE_DIR=/w/nvidia/NVIDIA-Linux-x86_64-440.64
>> make -C ${kdir} M=`pwd` O=obj $*
>
> Ok, now you are just trolling us.
>
> Nice job, I shouldn't have read the previous patches.
>
> Please, go get a lawyer to sign-off on this patch, with their corporate
> email address on it. That's the only way we could possibly consider
> something like this.
>
> Oh, and we need you to use your corporate email address too, as you are
> not putting copyright notices on this code, we will need to know who to
> come after in the future.

Jonathan, I think we need to do a better job of talking about patches that
are just meant to enable possible users vs. patches that we actually hope
the upstream kernel will take. Obviously code that only supports
out-of-tree drivers isn’t a good fit for the upstream kernel. From the
point of view of experimenting with these patches, GPUs benefit a lot from
this functionality, so I think it does make sense to have the enabling
patches somewhere, just not in this series.

We’re finding it more common to have PCIe switch hops between a [ GPU,
NIC ] pair and the CPU, which gives a huge advantage to out-of-tree
drivers or extensions that can DMA directly between the GPU and the NIC
without having to copy through the CPU. I’d love to have an alternative
built on TCP because that’s where we invest the vast majority of our
tuning, security and interoperability testing. It’s just more predictable
overall.

This isn’t a new story, but if we can layer on APIs that enable this
cleanly for in-tree drivers, we can work with the vendors to use better
supported APIs and have a more stable kernel. Obviously this is an RFC
and there’s a long road ahead, but as long as the upstream kernel
doesn’t provide an answer, out-of-tree drivers are going to fill in
the weak spots.

Other possible use cases would also include other GPUs, or my favorite:

NVMe <-> filesystem <-> NIC with io_uring driving the IO and without
copies.

-chris