Message-ID: <CANpmjNMhVcR6TiLv29HqSvVVurUMwtHiokodPyzvwFSeE6UpZw@mail.gmail.com>
Date: Tue, 5 May 2020 15:01:45 +0200
From: Marco Elver <elver@...gle.com>
To: David Gow <davidgow@...gle.com>
Cc: KUnit Development <kunit-dev@...glegroups.com>,
Brendan Higgins <brendanhiggins@...gle.com>,
"Paul E. McKenney" <paulmck@...nel.org>,
Dmitry Vyukov <dvyukov@...gle.com>,
Alexander Potapenko <glider@...gle.com>,
Andrey Konovalov <andreyknvl@...gle.com>,
kasan-dev <kasan-dev@...glegroups.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] kcsan: Add test suite
On Tue, 5 May 2020 at 07:00, David Gow <davidgow@...gle.com> wrote:
>
> On Mon, Apr 27, 2020 at 11:23 PM 'Marco Elver' via kasan-dev
> <kasan-dev@...glegroups.com> wrote:
> >
> > On Mon, 27 Apr 2020 at 16:35, Marco Elver <elver@...gle.com> wrote:
> > >
> > > This adds a KCSAN test focusing on the behaviour of the integrated
> > > runtime. It tests various race scenarios, and verifies the reports
> > > generated to the console. It makes use of KUnit for test
> > > organization, and the Torture framework for test thread control.
> > >
> > > Signed-off-by: Marco Elver <elver@...gle.com>
> > > ---
> >
> > +KUnit devs
> > We had some discussions on how to best test sanitizer runtimes, and we
> > believe that this test is what testing sanitizer runtimes should
> > roughly look like. Note that, for KCSAN there are various additional
> > complexities like multiple threads, and report generation isn't
> > entirely deterministic (need to run some number of iterations to get
> > reports, may get multiple reports, etc.).
>
> Thanks very much for writing the test. I do think that it goes a
> little outside what we'd normally expect of a unit test (notably with
> the issues around determinism and threading), but it's good to see
> KUnit being pushed in new directions a bit.
>
> The biggest issue in my mind is the possibility that the
> non-determinism of the tests could cause false positives. If we're
> trying to run as many KUnit tests as possible as part of continuous
> integration systems or as a condition for accepting patches, having
> flaky tests could be annoying. The KCSAN tests seem to break/fail
> as-is when run on single-core machines (at least, under qemu), so some
> way of documenting this as a requirement would probably be necessary,
> too.
True. Although note that we require CONFIG_KCSAN=y for this test to be
enabled, so I don't think it's a big problem for regular CI setups.
For a KCSAN setup, I'd expect that we know that running on a
single-core system doesn't yield many interesting results, regardless
of which tests are being run.
The non-deterministic nature of concurrent tests will never entirely
go away, but I think with the right preconditions met (at least N
CPUs, where N depends on PREEMPT_NONE, PREEMPT_VOLUNTARY or PREEMPT)
the tests here should not normally fail.
> One possibility would be to add support for "skipped" tests to KUnit
> (the TAP specification allows for it), so that the KCSAN test could
> detect cases where it's not reliable, and skip itself (leaving a note
> as to why). In the short term, though, we'd absolutely need some
> documentation around the dependencies for the test.
That would be nice. For the time being, I will add a precondition
check to test_init(), and print a warning if the test needs to be
skipped.
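For reference, such a precondition check might look roughly like the
sketch below. num_online_cpus() is a real kernel helper, but the
threshold, the message, and the idea of returning an error from init
are all assumptions about how a skip could be approximated until KUnit
gains native "skipped" support; this is not the actual patch code.

```c
/* Sketch only: approximate a "skip" by failing the test's init when
 * the precondition (enough online CPUs) isn't met. The CPU threshold
 * and warning text here are assumptions, not the real KCSAN test.
 */
static int test_init(struct kunit *test)
{
	if (num_online_cpus() < 2) {
		pr_warn("kcsan_test: requires >= 2 online CPUs, skipping\n");
		return -EINVAL; /* aborts the test case in lieu of a skip */
	}

	/* ... per-test setup continues here ... */
	return 0;
}
```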
> (For the record, the failures I saw were all due to running under qemu
> emulating as a uniprocessor/single-core machine: with
> CONFIG_PREEMPT_VOLUNTARY, it would just hang after creating the first
> couple of threads. With CONFIG_PREEMPT, the tests completed, but the
> majority of them failed.)
Right, let me try to fix those at least. I'll send v2.
(Paul: If you prefer a separate patch rather than v2, let me know.)
> > The main thing, however, is that we want to verify the actual output
> > (or absence of it) to console. This is what the KCSAN test does using
> > the 'console' tracepoint. Could KUnit provide some generic
> > infrastructure to check console output, like is done in the test here?
> > Right now I couldn't say what the most useful generalization of this
> > would be (without it just being a wrapper around the console
> > tracepoint), because the way I've decided to capture and then match
> > console output is quite test-specific. For now we can replicate this
> > logic on a per-test basis, but it would be extremely useful if there
> > was a generic interface that KUnit could provide in future.
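To make the pattern concrete, the console-capture approach described
above can be sketched roughly as follows. The
register_trace_console()/unregister_trace_console() probe interface
comes from <trace/events/printk.h>; the fixed-size buffer and the
matching strategy are simplified assumptions, not the exact KCSAN test
code.

```c
/* Sketch: hook the 'console' tracepoint, accumulate printk output
 * into a buffer, and match against it after the test has run. The
 * buffer handling is deliberately simplified.
 */
#include <linux/spinlock.h>
#include <linux/string.h>
#include <trace/events/printk.h>

static char observed[1024];
static size_t observed_len;
static DEFINE_SPINLOCK(observed_lock);

static void probe_console(void *ignore, const char *buf, size_t len)
{
	unsigned long flags;

	spin_lock_irqsave(&observed_lock, flags);
	if (observed_len + len < sizeof(observed)) {
		memcpy(observed + observed_len, buf, len);
		observed_len += len;
		observed[observed_len] = '\0';
	}
	spin_unlock_irqrestore(&observed_lock, flags);
}

/* Registered once around the test run:
 *   register_trace_console(probe_console, NULL);
 *   ... run the racing threads, then match expected report
 *       fragments against 'observed' ...
 *   unregister_trace_console(probe_console, NULL);
 */
```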
>
> This is something we've discussed here a couple of times as well.
> I'll confess to being a little wary of having tests rely too
> heavily on console output: it risks being fragile if the exact
> contents or formatting of messages change, or if tests end up
> carrying a lot of string formatting and/or parsing code. I do agree,
> though, that it probably needs to be at least a part of testing things
> like sanitizers where the ultimate goal is to produce console output.
> I'm not exactly sure how we'd implement it yet, so it's probably not
> going to happen extremely soon, but what you have here looks to me
> like a good example we can generalise as needed.
For the sanitizers, that fragility is exactly what we want: any change
in console output could indicate a bug. But as you say, for other
tests it might not make much sense.
Thanks,
-- Marco