lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0UcLWEP5GOqFEDeyGFpJre+g2_AbmBOSXJsoXZuCprGH0Q@mail.gmail.com>
Date: Mon, 8 Apr 2024 13:43:28 -0700
From: Alexander Duyck <alexander.duyck@...il.com>
To: Leon Romanovsky <leon@...nel.org>
Cc: Jakub Kicinski <kuba@...nel.org>, John Fastabend <john.fastabend@...il.com>, 
	Jiri Pirko <jiri@...nulli.us>, netdev@...r.kernel.org, bhelgaas@...gle.com, 
	linux-pci@...r.kernel.org, Alexander Duyck <alexanderduyck@...com>, davem@...emloft.net, 
	pabeni@...hat.com
Subject: Re: [net-next PATCH 00/15] eth: fbnic: Add network driver for Meta
 Platforms Host Network Interface

On Mon, Apr 8, 2024 at 11:41 AM Leon Romanovsky <leon@...nel.org> wrote:
>
> On Mon, Apr 08, 2024 at 08:26:33AM -0700, Alexander Duyck wrote:
> > On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@...nel.org> wrote:
> > >
> > > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote:
> > > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > >
> > > <...>
> > >
> > > > > > > Technical solution? Maybe if it's not a public device regression rules
> > > > > > > don't apply? Seems fairly reasonable.
> > > > > >
> > > > > > This is a hypothetical. This driver currently isn't changing anything
> > > > > > outside of itself. At this point the driver would only be build tested
> > > > > > by everyone else. They could just not include it in their Kconfig and
> > > > > > then out-of-sight, out-of-mind.
> > > > >
> > > > > Not changing does not mean not depending on existing behavior.
> > > > > Investigating and fixing properly even the hardest regressions in
> > > > > the stack is a bar that Meta can so easily clear. I don't understand
> > > > > why you are arguing.
> > > >
> > > > I wasn't saying the driver wouldn't be dependent on existing behavior.
> > > > I was saying that it was a hypothetical that Meta would be a "less
> > > > than cooperative user" and demand a revert.  It is also a hypothetical
> > > > that Linus wouldn't just propose a revert of the fbnic driver instead
> > > > of the API for the crime of being a "less than cooperative maintainer"
> > > > and  then give Meta the Nvidia treatment.
> > >
> > > It is very easy to be "less than cooperative maintainer" in netdev world.
> > > 1. Be vendor.
> > > 2. Propose ideas which are different.
> > > 3. Report for user-visible regression.
> > > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice.
> > >
> > > And voilà, you are "less than cooperative maintainer".
> > >
> > > So in reality, the "hypothetical" is very close to the reality, unless
> > > Meta contribution will be treated as a special case.
> > >
> > > Thanks
> >
> > How many cases of that have we had in the past? I'm honestly curious
> > as I don't actually have any reference.
>
> And this is the problem, you don't have "any reference" and accurate
> knowledge what happened, but you are saying "less than cooperative
> maintainer".

By "less than cooperative maintainer" I was referring to the scenario
where somebody is maintaining something unique to them, such as the
Meta Host NIC, and not willing to work with the community to fix it
and instead just demanding a revert of a change. It doesn't seem like
it would be too much to ask to work with the author on a fix for the
problem as long as the maintainer is willing to work with the author
on putting together and testing the fix.

With that said if the upstream version of things aren't broken then it
doesn't matter. It shouldn't be expected of the community to maintain
any proprietary code that wasn't accepted upstream.

> >
> > Also as far as item 3 isn't hard for it to be a "user-visible"
> > regression if there are no users outside of the vendor that is
> > maintaining the driver to report it?
>
> This wasn't the case. It was change in core code, which broke specific
> version of vagrant. Vendor caught it simply by luck.

Any more info on this? Without context it is hard to say one way or the other.

I know I have seen my fair share of hot issues such as when the
introduction of the tracing framework was corrupting the NVRAM on
e1000e NICs.[1] It got everyone's attention when it essentially
bricked one of Linus's systems. I don't recall us doing a full revert
on function tracing as a result, but I believe it was flagged as
broken until it could be resolved. So depending on the situation there
are cases where asking for a fix or revert might be appropriate.

> > Again I am assuming that the same rules wouldn't necessarily apply
> > in the vendor/consumer being one entity case.
> >
> > Also from my past experience the community doesn't give a damn about
> > 1. It is only if 3 is being reported by actual users that somebody
> > would care. The fact is if vendors held that much power they would
> > have run roughshod over the community long ago as I know there are
> > vendors who love to provide one-off projects outside of the kernel and
> > usually have to work to get things into the upstream later and no
> > amount of complaining about "the users" will get their code accepted.
> > The users may complain but it is the vendors fault for that so the
> > community doesn't have to take action.
>
> You are taking it to completely wrong direction with your assumptions.
> The reality is that regression was reported by real user without any
> vendor code involved. This is why the end result was so bad for all parties.

Okay, but that doesn't tie into what is going on here. In this case
"vendor" == "user". Like I was saying the community generally cares
about the user so 3 would be the important case assuming they are
using a stock kernel and driver and not hiding behind the vendor
expecting some sort of proprietary fix. If they are using some
proprietary stuff behind the scenes, then tough luck.

> So no, you can get "less than cooperative maintainer" label really easy in
> current environment.

I didn't say you couldn't. Without context I cannot say if it was
deserved or not. I know in the example I cited above Intel had to add
changes to the e1000e driver to make the NVRAM non-writable until the
problem patch was found. So Intel was having to patch to fix an issue
it didn't introduce and deal with the negative press and blow-back
from a function tracing patch that was damaging NICs.

The point I was trying to make is that if you are the only owner of
something, and not willing to work with the community as a maintainer
it becomes much easier for the community to just revert the driver
rather than try to change the code if you aren't willing to work with
them. Thus the "less than cooperative" part. The argument being made
seems to be that once something is in the kernel it is forever and if
we get it in and then refuse to work with the community it couldn't be
reverted. I am arguing that isn't the case, especially if Meta were to
become a "less than cooperative maintainer" for a device that is
primarily only going to be available in Meta data centers.

Thanks,

- Alex

[1]: https://lwn.net/Articles/304105/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ