netdev - Re: [PATCH AUTOSEL 4.9 09/26] net/mlx5e: Init ethtool steering for representors

Message-ID: <3226e1df60666c0c4e3256ec069fee2d814d9a03.camel@mellanox.com>
Date:   Thu, 16 Apr 2020 21:08:06 +0000
From:   Saeed Mahameed <saeedm@...lanox.com>
To:     "sashal@...nel.org" <sashal@...nel.org>
CC:     "ecree@...arflare.com" <ecree@...arflare.com>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "gerlitz.or@...il.com" <gerlitz.or@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>,
        "leon@...nel.org" <leon@...nel.org>
Subject: Re: [PATCH AUTOSEL 4.9 09/26] net/mlx5e: Init ethtool steering for
 representors

On Thu, 2020-04-16 at 15:58 -0400, Sasha Levin wrote:
> On Thu, Apr 16, 2020 at 07:07:13PM +0000, Saeed Mahameed wrote:
> > On Thu, 2020-04-16 at 09:30 -0400, Sasha Levin wrote:
> > > On Thu, Apr 16, 2020 at 08:24:09AM +0300, Leon Romanovsky wrote:
> > > > On Thu, Apr 16, 2020 at 04:08:10AM +0000, Saeed Mahameed wrote:
> > > > > On Wed, 2020-04-15 at 20:00 -0400, Sasha Levin wrote:
> > > > > > On Wed, Apr 15, 2020 at 05:18:38PM +0100, Edward Cree
> > > > > > wrote:
> > > > > > > Firstly, let me apologise: my previous email was too
> > > > > > > harsh
> > > > > > > and too
> > > > > > >  assertiveabout things that were really more uncertain
> > > > > > > and
> > > > > > > unclear.
> > > > > > > 
> > > > > > > On 14/04/2020 21:57, Sasha Levin wrote:
> > > > > > > > I've pointed out that almost 50% of commits tagged for
> > > > > > > > stable do
> > > > > > > > not
> > > > > > > > have a fixes tag, and yet they are fixes. You really
> > > > > > > > deduce
> > > > > > > > things based
> > > > > > > > on coin flip probability?
> > > > > > > Yes, but far less than 50% of commits *not* tagged for
> > > > > > > stable
> > > > > > > have
> > > > > > > a fixes
> > > > > > >  tag.  It's not about hard-and-fast Aristotelian
> > > > > > > "deductions", like
> > > > > > > "this
> > > > > > >  doesn't have Fixes:, therefore it is not a stable
> > > > > > > candidate", it's
> > > > > > > about
> > > > > > >  probabilistic "induction".
> > > > > > > 
> > > > > > > > "it does increase the amount of countervailing evidence
> > > > > > > > needed to
> > > > > > > > conclude a commit is a fix" - Please explain this
> > > > > > > > argument
> > > > > > > > given
> > > > > > > > the
> > > > > > > > above.
> > > > > > > Are you familiar with Bayesian statistics?  If not, I'd
> > > > > > > suggest
> > > > > > > reading
> > > > > > >  something like http://yudkowsky.net/rational/bayes/
> > > > > > > which
> > > > > > > explains
> > > > > > > it.
> > > > > > > There's a big difference between a coin flip and a
> > > > > > > _correlated_
> > > > > > > coin flip.
> > > > > > 
> > > > > > I'd maybe point out that the selection process is based on
> > > > > > a
> > > > > > neural
> > > > > > network which knows about the existence of a Fixes tag in a
> > > > > > commit.
> > > > > > 
> > > > > > It does exactly what you're describing, but also taking a
> > > > > > bunch
> > > > > > more
> > > > > > factors into its decision process ("panic"? "oops"?
> > > > > > "overflow"?
> > > > > > etc).
> > > > > > 
> > > > > 
> > > > > I am not against AUTOSEL in general, as long as the decision
> > > > > to
> > > > > know
> > > > > how far back it is allowed to take a patch is made
> > > > > deterministically
> > > > > and not statistically based on some AI hunch.
> > > > > 
> > > > > Any auto selection for a patch without a Fixes tag can be
> > > > > catastrophic
> > > > > .. imagine a patch without a Fixes Tag with a single line
> > > > > that is
> > > > > fixing some "oops", such patch can be easily applied cleanly
> > > > > to
> > > > > stable-
> > > > > v.x and stable-v.y .. while it fixes the issue on v.x it
> > > > > might
> > > > > have
> > > > > catastrophic results on v.y ..
> > > > 
> > > > I tried to imagine such flow and failed to do so. Are you
> > > > talking
> > > > about
> > > > a specific or an imaginary case?
> > > 
> > > It happens, rarely, but it does. However, all the cases I can
> > > think
> > > of
> > > happened with a stable tagged commit without a fixes where its
> > > backport
> > > to an older tree caused unintended behavior (local denial of
> > > service
> > > in
> > > one case).
> > > 
> > > The scenario you have in mind is true for both stable and non-
> > > stable
> > > tagged patches, so if you want to restrict how we deal with
> > > commits
> > > that
> > > don't have a fixes tag shouldn't it be true for *all* commits?
> > 
> > All commits? even the ones without "oops" in them ? where does this
> > stop ? :)
> > We _must_ have a hard and deterministic cut for how far back to
> > take a
> > patch based on a human decision.. unless we are 100% positive
> > autoselection AI can never make a mistake.
> > 
> > Humans are allowed to make mistakes, AI is not.
> 
> Oh I'm reviewing all patches myself after the bot does its
> selection,
> you can blame me for these screw ups.
> 
> > If a Fixes tag is wrong, then a human will be blamed, and that is
> > perfectly fine, but if we have some statistical model that we know
> > it
> > is going to be wrong 0.001% of the time.. and we still let it run..
> > then something needs to be done about this.
> > 
> > I know there are benefits to autosel, but over time, if this is not
> > being audited, many pieces of the kernel will get broken unnoticed
> > until some poor distro decides to upgrade their kernel version.
> 
> Quite a few distros are always running on the latest LTS releases,
> Android isn't that far behind either at this point.
> 
> There are actually very few non-LTS users at this point...
> 
> > > > <...>
> > > > > > Let me put my Microsoft employee hat on here. We have
> > > > > > drivers/net/hyperv/
> > > > > > which definitely wasn't getting all the fixes it should
> > > > > > have
> > > > > > been
> > > > > > getting without AUTOSEL.
> > > > > > 
> > > > > 
> > > > > until some patch which shouldn't get backported slips
> > > > > through,
> > > > > believe
> > > > > me this will happen, just give it some time ..
> > > > 
> > > > Bugs are inevitable, I don't see many differences between bugs
> > > > introduced by manual cherry-picking or by automatic selection.
> > > 
> > > Oh bugs slip in, that's why I track how many bugs slipped via
> > > stable
> > > tagged commits vs non-stable tagged ones, and the statistic may
> > > surprise
> > > you.
> > > 
> > 
> > Statistics do not matter here, what really matters is that there is
> > a
> > possibility of a non-human induced error, this should be a no no.
> > or at least make it an opt-in thing for those who want to take
> > their
> > chances and keep a close eye on it..
> 
> Hrm, why? Pretend that the bot is a human sitting somewhere sending
> mails out, how does it change anything?
> 

If I know a bot might do something wrong, I fix it and make sure it
will never do it again. For humans I just can't do that, can I? :)
So this is the difference, and why we all have jobs ..

> > > The solution here is to beef up your testing infrastructure
> > > rather
> > > than
> > 
> > So please let me opt-in until I beef up my testing infra.
> 
> Already did :)

No you didn't :), I received more than 5 AUTOSEL emails only today and
yesterday.

Please don't opt mlx5 out just yet ;-), I need to do some more research
and make up my mind..

> 
> > > taking less patches; we still want to have *all* the fixes,
> > > right?
> > > 
> > 
> > if you can be sure 100% it is the right thing to do, then yes,
> > please
> > don't hesitate to take that patch, even without asking anyone !!
> > 
> > Again, Humans are allowed to make mistakes.. AI is not.
> 
> Again, why?
> 

Because AI is not there yet.. and this is a very big philosophical
question.

Let me simplify: there is a bug in the AI, where it can choose a wrong
patch, let's fix it.