Message-ID: <CAKgT0UeBCBfeq5TxTjND6G_S=CWYZsArxQxVb-2paK_smfcn2w@mail.gmail.com>
Date: Fri, 5 Apr 2024 07:24:32 -0700
From: Alexander Duyck <alexander.duyck@...il.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Paolo Abeni <pabeni@...hat.com>, Jakub Kicinski <kuba@...nel.org>,
John Fastabend <john.fastabend@...il.com>, Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org,
bhelgaas@...gle.com, linux-pci@...r.kernel.org,
Alexander Duyck <alexanderduyck@...com>, davem@...emloft.net, Christoph Hellwig <hch@....de>
Subject: Re: [net-next PATCH 00/15] eth: fbnic: Add network driver for Meta Platforms Host Network Interface

On Fri, Apr 5, 2024 at 5:26 AM Jason Gunthorpe <jgg@...dia.com> wrote:
>
> On Fri, Apr 05, 2024 at 09:11:19AM +0200, Paolo Abeni wrote:
> > On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote:
> > > Again, I would say we look at the blast radius. That is how we should
> > > be measuring any change. At this point the driver is self contained
> > > into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
> > > outside that directory, and it can be switched off via Kconfig.
> >
> > I personally think this is the most relevant point. This is just a new
> > NIC driver, completely self-encapsulated. I quickly glanced over the
> > code and it looks like it's not doing anything obviously bad. It really
> > looks like a usual, legit, NIC driver.
>
> This is completely true, and as I've said many times the kernel as a
> project is substantially about supporting the HW that people actually
> build. There is no reason not to merge yet another basic netdev
> driver.
>
> However, there is also a pretty strong red line in Linux where people
> believe, with strong conviction, that kernel code should not be merged
> only to support a proprietary userspace. This submission is clearly
> blurring that line. This driver will only run in Meta's proprietary
> kernel fork on servers running Meta's proprietary userspace.
>
> At this point perhaps it is OK, a basic NIC driver is not really an
> issue, but Jiri is also very correct to point out that this is heading
> in a very concerning direction.
>
> Alex already indicated new features are coming, changes to the core
> code will be proposed. How should those be evaluated? Hypothetically
> should fbnic be allowed to be the first implementation of something
> invasive like Mina's DMABUF work? Google published an open userspace
> for NCCL that people can (in theory at least) actually run. Meta would
> not be able to do that. I would say that clearly crosses the line and
> should not be accepted.

Why not? Just because we are not commercially selling it doesn't mean
we couldn't look at other solutions such as QEMU. If we were to
provide a GitHub repo with an emulation of the NIC, would that be
enough to satisfy the "commercial" requirement?

The fact is I already have an implementation, but I would probably
need to clean up a few things, as the current setup requires 3 QEMU
instances to emulate the full setup with host, firmware, and BMC. It
wouldn't be as performant as the actual hardware, but it is more than
enough for us to test code with. If we need to look at publishing
something like that to GitHub in order to address the lack of user
availability, I could start looking at getting the approvals for that.

> So I think there should be an expectation that technically sound things
> Meta may propose must not be accepted because they cross the
> ideological red line into enabling only proprietary software.

That is a faulty argument. That is like saying we should kick the
nouveau driver out of Linux just because it supports Nvidia graphics
cards that happen to also have a proprietary out-of-tree driver out
there, or maybe we need to kick all the Intel NIC drivers out for
DPDK? I can't think of many NIC vendors that don't have their own
out-of-tree drivers floating around with their own kernel bypass
solutions to support proprietary software.

> To me it sets up a fairly common anti-pattern where a vendor starts
> out with good intentions, reaches community pushback and falls back to
> their downstream fork. Once forking occurs it becomes self-reinforcing
> as built up infrastructure like tests and CI will only run correctly
> on the fork and the fork grows. Then eventually the upstream code is
> abandoned. This has happened many times before in Linux.
>
> IMHO from a community perspective I feel like we should expect Meta to
> fail and end up with a fork. The community should warn them. However
> if they really want to try anyhow then I'm not sure it would be
> appropriate to stop them at this point. Meta will just end up being a
> "bad vendor".
>
> I think the best thing the netdev community could do is come up with
> some more clear guidelines what Meta could use fbnic to justify and
> what would be rejected (ideologically) and Meta can decide on their
> own if they want to continue.

I agree. We need a consistent set of standards. I just strongly
believe commercial availability shouldn't be one of them.