netdev - Re: [PATCH AUTOSEL 4.9 09/26] net/mlx5e: Init ethtool steering for representors

Message-ID: <20200416195859.GP1068@sasha-vm>
Date:   Thu, 16 Apr 2020 15:58:59 -0400
From:   Sasha Levin <sashal@...nel.org>
To:     Saeed Mahameed <saeedm@...lanox.com>
Cc:     "leon@...nel.org" <leon@...nel.org>,
        "ecree@...arflare.com" <ecree@...arflare.com>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "gerlitz.or@...il.com" <gerlitz.or@...il.com>,
        "davem@...emloft.net" <davem@...emloft.net>
Subject: Re: [PATCH AUTOSEL 4.9 09/26] net/mlx5e: Init ethtool steering for
 representors

On Thu, Apr 16, 2020 at 07:07:13PM +0000, Saeed Mahameed wrote:
>On Thu, 2020-04-16 at 09:30 -0400, Sasha Levin wrote:
>> On Thu, Apr 16, 2020 at 08:24:09AM +0300, Leon Romanovsky wrote:
>> > On Thu, Apr 16, 2020 at 04:08:10AM +0000, Saeed Mahameed wrote:
>> > > On Wed, 2020-04-15 at 20:00 -0400, Sasha Levin wrote:
>> > > > On Wed, Apr 15, 2020 at 05:18:38PM +0100, Edward Cree wrote:
>> > > > > Firstly, let me apologise: my previous email was too harsh
>> > > > > and too
>> > > > >  assertive about things that were really more uncertain and
>> > > > > unclear.
>> > > > >
>> > > > > On 14/04/2020 21:57, Sasha Levin wrote:
>> > > > > > I've pointed out that almost 50% of commits tagged for
>> > > > > > stable do
>> > > > > > not
>> > > > > > have a fixes tag, and yet they are fixes. You really deduce
>> > > > > > things based
>> > > > > > on coin flip probability?
>> > > > > Yes, but far less than 50% of commits *not* tagged for stable
>> > > > > have
>> > > > > a fixes
>> > > > >  tag.  It's not about hard-and-fast Aristotelian
>> > > > > "deductions", like
>> > > > > "this
>> > > > >  doesn't have Fixes:, therefore it is not a stable
>> > > > > candidate", it's
>> > > > > about
>> > > > >  probabilistic "induction".
>> > > > >
>> > > > > > "it does increase the amount of countervailing evidence
>> > > > > > needed to
>> > > > > > conclude a commit is a fix" - Please explain this argument
>> > > > > > given
>> > > > > > the
>> > > > > > above.
>> > > > > Are you familiar with Bayesian statistics?  If not, I'd
>> > > > > suggest
>> > > > > reading
>> > > > >  something like http://yudkowsky.net/rational/bayes/ which
>> > > > > explains
>> > > > > it.
>> > > > > There's a big difference between a coin flip and a
>> > > > > _correlated_
>> > > > > coin flip.
>> > > >
>> > > > I'd maybe point out that the selection process is based on a
>> > > > neural
>> > > > network which knows about the existence of a Fixes tag in a
>> > > > commit.
>> > > >
>> > > > It does exactly what you're describing, but also taking a bunch
>> > > > more
>> > > > factors into its decision process ("panic"? "oops"?
>> > > > "overflow"?
>> > > > etc).
>> > > >
>> > >
>> > > I am not against AUTOSEL in general, as long as the decision to
>> > > know
>> > > how far back it is allowed to take a patch is made
>> > > deterministically
>> > > and not statistically based on some AI hunch.
>> > >
>> > > Any auto selection for a patch without a Fixes tag can be
>> > > catastrophic
>> > > .. imagine a patch without a Fixes Tag with a single line that is
>> > > fixing some "oops", such patch can be easily applied cleanly to
>> > > stable-
>> > > v.x and stable-v.y .. while it fixes the issue on v.x it might
>> > > have
>> > > catastrophic results on v.y ..
>> >
>> > I tried to imagine such flow and failed to do so. Are you talking
>> > about
>> > anything specific or imaginary case?
>>
>> It happens, rarely, but it does. However, all the cases I can think
>> of
>> happened with a stable tagged commit without a fixes tag where its
>> backport
>> to an older tree caused unintended behavior (local denial of service
>> in
>> one case).
>>
>> The scenario you have in mind is true for both stable and non-stable
>> tagged patches, so if you want to restrict how we deal with commits
>> that
>> don't have a fixes tag shouldn't it be true for *all* commits?
>
>All commits? even the ones without "oops" in them ? where does this
>stop ? :)
>We _must_ have a hard and deterministic cut for how far back to take a
>patch based on a human decision.. unless we are 100% positive
>autoselection AI can never make a mistake.
>
>Humans are allowed to make mistakes, AI is not.

Oh I'm reviewing all patches myself after the bot does its selection,
you can blame me for these screw ups.

>If a Fixes tag is wrong, then a human will be blamed, and that is
>perfectly fine, but if we have some statistical model that we know it
>is going to be wrong 0.001% of the time.. and we still let it run..
>then something needs to be done about this.
>
>I know there are benefits to autosel, but over time, if this is not
>being audited, many pieces of the kernel will get broken unnoticed
>until some poor distro decides to upgrade their kernel version.

Quite a few distros are always running on the latest LTS releases,
Android isn't that far behind either at this point.

There are actually very few non-LTS users at this point...

>>
>> > <...>
>> > > > Let me put my Microsoft employee hat on here. We have
>> > > > drivers/net/hyperv/
>> > > > which definitely wasn't getting all the fixes it should have
>> > > > been
>> > > > getting without AUTOSEL.
>> > > >
>> > >
>> > > until some patch which shouldn't get backported slips through,
>> > > believe
>> > > me this will happen, just give it some time ..
>> >
>> > Bugs are inevitable, I don't see many differences between bugs
>> > introduced by manual cherry-picking or by automatic selection.
>>
>> Oh bugs slip in, that's why I track how many bugs slipped via stable
>> tagged commits vs non-stable tagged ones, and the statistic may
>> surprise
>> you.
>>
>
>Statistics do not matter here, what really matters is that there is a
>possibility of a non-human induced error, this should be a no no.
>or at least make it an opt-in thing for those who want to take their
>chances and keep a close eye on it..

Hrm, why? Pretend that the bot is a human sitting somewhere sending
mails out, how does it change anything?

>> The solution here is to beef up your testing infrastructure rather
>> than
>
>So please let me opt-in until I beef up my testing infra.

Already did :)

>> taking fewer patches; we still want to have *all* the fixes, right?
>>
>
>if you can be sure 100% it is the right thing to do, then yes, please
>don't hesitate to take that patch, even without asking anyone !!
>
>Again, Humans are allowed to make mistakes.. AI is not.

Again, why?

-- 
Thanks,
Sasha