Message-ID: <550454b8-2e2c-c947-92c5-37f0367661c2@mailbox.org>
Date:   Mon, 11 Sep 2023 15:30:55 +0200
From:   Michel Dänzer <michel.daenzer@...lbox.org>
To:     Maxime Ripard <mripard@...nel.org>
Cc:     emma@...olt.net, linux-doc@...r.kernel.org,
        vignesh.raman@...labora.com, dri-devel@...ts.freedesktop.org,
        alyssa@...enzweig.io, jbrunet@...libre.com, robdclark@...gle.com,
        corbet@....net, khilman@...libre.com,
        sergi.blanch.torne@...labora.com, david.heidelberg@...labora.com,
        linux-rockchip@...ts.infradead.org,
        Daniel Stone <daniels@...labora.com>,
        martin.blumenstingl@...glemail.com, robclark@...edesktop.org,
        Helen Koike <helen.koike@...labora.com>, anholt@...gle.com,
        linux-mediatek@...ts.infradead.org, matthias.bgg@...il.com,
        linux-amlogic@...ts.infradead.org, gustavo.padovan@...labora.com,
        linux-arm-kernel@...ts.infradead.org,
        angelogioacchino.delregno@...labora.com, neil.armstrong@...aro.org,
        guilherme.gallo@...labora.com, linux-kernel@...r.kernel.org,
        tzimmermann@...e.de
Subject: Re: [PATCH v11] drm: Add initial ci/ subdirectory

On 9/11/23 14:51, Maxime Ripard wrote:
> On Mon, Sep 11, 2023 at 02:13:43PM +0200, Michel Dänzer wrote:
>> On 9/11/23 11:34, Maxime Ripard wrote:
>>> On Thu, Sep 07, 2023 at 01:40:02PM +0200, Daniel Stone wrote:
>>>>
>>>> Secondly, we will never be there. If we could pause for five years and sit
>>>> down making all the current usecases for all the current hardware on the
>>>> current kernel run perfectly, we'd probably get there. But we can't: there's
>>>> new hardware, new userspace, and hundreds of new kernel trees.
>>>
>>> [...]
>>> 
>>> I'm not sure it's actually an argument, really. 10 years ago, we would
>>> never have been at "every GPU on the market has an open-source driver"
>>> here. 5 years ago, we would never have been at this-series-here. That
>>> didn't stop anyone making progress, everyone involved in that thread
>>> included.
>>
>> Even assuming perfection is achievable at all (which is very doubtful,
>> given the experience from the last few years of CI in Mesa and other
>> projects), if you demand perfection before even taking the first step,
>> it will never get off the ground.
> 
> Perfection and scale from the get-go isn't reasonable, yes. Building a
> small, "perfect" (your words, not mine) system that you can later expand
> is doable.

I mean "perfect" as in every single available test runs, is reliable and gates CI. Which seems to be what you're asking for. The only possible expansion of such a system would be adding new 100% reliable tests.

What is being proposed here is an "imperfect" system which takes into account the reality that some tests are not 100% reliable, and can be improved gradually while already preventing some regressions from getting merged.
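
For illustration only, since the exact file names and syntax are up to this
series: the Mesa-style approach keeps per-device expectation files next to
the CI definitions, along the lines of

  drivers/gpu/drm/ci/xfails/
    rockchip-rk3288-fails.txt   # known failures, gated on the recorded result
    rockchip-rk3288-flakes.txt  # unreliable tests, results not gated
    rockchip-rk3288-skips.txt   # tests not run at all, e.g. because they hang

Everything not listed is expected to pass and gates CI, so quarantining a
flaky test doesn't stop the rest of the suite from catching regressions.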


>>> How are we even supposed to detect those failures in the first
>>> place if tests are flagged as unreliable?
>>
>> Based on experience with Mesa, only a relatively small minority of
>> tests should need to be marked as flaky / not run at all. The majority
>> of tests are reliable and can catch regressions even while some tests
>> are not yet.
> 
> I understand and acknowledge that it worked with Mesa. That's great for
> Mesa. That still doesn't mean that it's the panacea and is for every
> project.

Not sure what you're referring to by panacea, or how it relates to "some tests can be useful even while others aren't yet".


>>> No matter what we do here, what you describe will always happen. Like,
>>> if we do flag those tests as unreliable, what exactly prevents another
>>> issue from coming in on top undetected, and what will happen when we
>>> re-enable testing?
>>
>> Any issues affecting a test will need to be fixed before (re-)enabling
>> the test for CI.
> 
> If that underlying issue is never fixed, at which point do we consider
> that it's a failure and should never be re-enabled? Who has that role?

Not sure what you're asking. Anybody can (re-)enable a test in CI; they just need to make sure first that it is reliable. Until somebody does that work, it'll stay disabled in CI.
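
As a sketch of that workflow (nothing here is mandated by the series, and
the exact IGT invocation, test name and paths will differ per board): rerun
the test in a loop on the affected hardware until you trust it, then drop
it from the flakes list in a commit that says how it was validated.

  # hypothetical example; adjust test name and runner invocation
  for i in $(seq 1 100); do
          ./kms_cursor_legacy --run-subtest flip-vs-cursor-atomic || break
  done
  # all 100 runs passed -> remove the line from the *-flakes.txt file

The burden of proof is on whoever re-enables the test, not on the CI
infrastructure.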


>>> It might or might not be an issue for Linus' release, but I can
>>> definitely see the trouble already for stable releases, where fixes
>>> will be backported while the test state list certainly won't be
>>> updated.
>>
>> If the stable branch maintainers want to take advantage of CI for the
>> stable branches, they may need to hunt for corresponding state list
>> commits sometimes. They'll need to take that into account for their
>> decision.
> 
> So we just expect the stable maintainers to track each and every patch
> involved in a test run, make sure that they are in a stable tree, and
> then update the test list? Without having consulted them at all?

I don't expect them to do anything. See the "If" at the start of what I wrote.
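
If they did want to take advantage of it, finding the relevant updates is a
one-liner (paths and ref names here are illustrative, assuming the
expectation files live under the new ci/ subdirectory):

  git log --oneline v6.6..drm-tip -- drivers/gpu/drm/ci/

Whether that effort is worth it for a given stable branch is entirely their
call.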


>>>> By keeping those sets of expectations, we've been able to keep Mesa pretty
>>>> clear of regressions, whilst having a very clear set of things that should
>>>> be fixed to point to. It would be great if those set of things were zero,
>>>> but it just isn't. Having that is far better than the two alternatives:
>>>> either not testing at all (obviously bad), or having the test always be red
>>>> so it's always ignored (might as well just not test).
>>>
>>> Isn't that what happens with flaky tests anyway?
>>
>> For a small minority of tests. Daniel was referring to whole test suites.
>>
>>> Even more so since we have 0 context when updating that list.
>>
>> The commit log can provide whatever context is needed.
> 
> Sure, I've yet to see that though.
> 
> There are around 240 reported flaky tests in 6.6-rc1. None of them have
> any context. That new series adds a few dozen too, without any context
> either. And there's no mention of that being a plan, or a patch
> adding a new policy for all tests going forward.

That does sound bad; it would need to be raised in review.
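
For what it's worth, nothing stops the expectation files themselves from
carrying that context, hypothetically something like

  # <link to tracking issue>
  # flaky since <IGT version>; fails roughly 1 run in N on this board
  kms_cursor_legacy@flip-vs-cursor-atomic

in addition to the commit log. That's a review requirement rather than a
tooling one, though.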


> Any concern I raised was met with a giant "it worked on Mesa" handwave

Lessons learned from years of experience with big real-world CI systems like this are hardly "handwaving".


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer
