linux-kernel - Re: [PATCH] Documentation: kunit: provide guidance for testing many inputs

Message-ID: <CABVgOSnCkbs+5EEOq7gxUdQUpTL4vOPFvagJoMQ7t0dtg4DQeA@mail.gmail.com>
Date:   Sat, 7 Nov 2020 12:21:01 +0800
From:   David Gow <davidgow@...gle.com>
To:     Daniel Latypov <dlatypov@...gle.com>
Cc:     Brendan Higgins <brendanhiggins@...gle.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "open list:KERNEL SELFTEST FRAMEWORK" 
        <linux-kselftest@...r.kernel.org>,
        Shuah Khan <skhan@...uxfoundation.org>
Subject: Re: [PATCH] Documentation: kunit: provide guidance for testing many inputs

On Tue, Nov 3, 2020 at 5:37 AM Daniel Latypov <dlatypov@...gle.com> wrote:
>
> usage.rst goes into a detailed about faking out classes, but currently

Nit: a detailed what?

> lacks wording about how one might idiomatically test a range of inputs.
>
> Give an example of how one might test a hash function via macros/helper
> funcs and a table-driven test and very briefly discuss pros and cons.
>
> Also highlight the KUNIT_EXPECT_*_MSG() variants (that aren't mentioned
> elsewhere [1]) which are particularly useful in these situations.
>
> It is also criminally underused at the moment, only appearing in 2
> tests (both written by people involved in KUnit).
>
> [1] not even on
> https://www.kernel.org/doc/html/latest/dev-tools/kunit/api/test.html

I suspect we'll eventually want to document the _MSG variants here as
well, though it will bloat the page somewhat. In any case, it can be
left to a separate patch.

>
> Signed-off-by: Daniel Latypov <dlatypov@...gle.com>
> ---

Thanks for writing this -- it's definitely a common test pattern which
it'd be nice to encourage and explain a bit better.

Cheers,
-- David

>  Documentation/dev-tools/kunit/usage.rst | 66 +++++++++++++++++++++++++
>  1 file changed, 66 insertions(+)
>
> diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst
> index 62142a47488c..317390df2b96 100644
> --- a/Documentation/dev-tools/kunit/usage.rst
> +++ b/Documentation/dev-tools/kunit/usage.rst
> @@ -451,6 +451,72 @@ We can now use it to test ``struct eeprom_buffer``:
>                 destroy_eeprom_buffer(ctx->eeprom_buffer);
>         }
>
> +Testing various inputs
> +----------------------
Nit: "various" isn't hugely descriptive here. Maybe something like
"Testing against multiple inputs" would be better?

> +
> +Testing just a few inputs might not be enough to have confidence that the code
> +works correctly, e.g. for a hash function.
> +
> +In such cases, it can be helpful to have a helper macro or function, e.g. this
> +fictitious example for ``md5sum(1)``
> +
> +.. code-block:: c
> +
> +       /* Note: the cast is to satisfy overly strict type-checking. */
> +       #define TEST_MD5(in, want) \
> +               md5sum(in, out); \
> +               KUNIT_EXPECT_STREQ_MSG(test, (char *)out, want, "md5sum(%s)", in);
> +
> +       char out[16];
> +       TEST_MD5("hello world",   "5eb63bbbe01eeed093cb22bb8f5acdc3");
> +       TEST_MD5("hello world!",  "fc3ff98e8c6a0d3087d515c0473f8677");
> +
> +Note the use of ``KUNIT_EXPECT_STREQ_MSG`` to give more context when it fails
> +and make it easier to track down. (Yes, in this example, ``want`` is likely
> +going to be unique enough on its own).
> +
> +The ``_MSG`` variants are even more useful when the same expectation is called
> +multiple times (in a loop or helper function) and thus the line number isn't
> +enough to identify what failed, like below.
> +
> +In some cases, it can be helpful to write a *table-driven test* instead, e.g.
> +
> +.. code-block:: c
> +
> +       int i;
> +       char out[16];
> +
> +       struct md5_test_case {
> +               const char *str;
> +               const char *md5;
> +       };
> +
> +       struct md5_test_case cases[] = {
> +               {
> +                       .str = "hello world",
> +                       .md5 = "5eb63bbbe01eeed093cb22bb8f5acdc3",
> +               },
> +               {
> +                       .str = "hello world!",
> +                       .md5 = "fc3ff98e8c6a0d3087d515c0473f8677",
> +               },
> +       };
> +       for (i = 0; i < ARRAY_SIZE(cases); ++i) {
> +               md5sum(cases[i].str, out);
> +               KUNIT_EXPECT_STREQ_MSG(test, (char *)out, cases[i].md5,
> +                                     "md5sum(%s)", cases[i].str);
> +       }
> +
> +
> +There's more boilerplate involved, but it can:
> +
> +* be more readable when there are multiple inputs/outputs thanks to field names,
> +
> +  * E.g. see ``fs/ext4/inode-test.c`` for an example of both.
> +* reduce duplication if test cases can be shared across multiple tests.
> +
> +  * E.g. if we had a magical ``undo_md5sum`` function, we could reuse ``cases``.
> +

This is a bit of a nitpick, but I don't think this is quite conveying
the usefulness of table-based testing. Maybe it's that a hypothetical
"undo_md5sum" is too unrealistic an example? Maybe, instead of having
both the macro-based and table-driven examples based around md5sum(),
the table-based one could use something more obviously invertible /
reusable, and include both in the example code. E.g., something akin to
toupper() and tolower() or some other conversion function. I think
having a better example here is probably more useful than having both
the table- and macro-driven examples test the same thing.
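
A minimal sketch of that shared-table idea, written as plain userspace C
rather than KUnit code so it compiles on its own. The names
``struct case_pair``, ``cases``, ``check_toupper()`` and ``check_tolower()``
are hypothetical, and ``ARRAY_SIZE`` is defined locally because the kernel
macro is not available here. In a real KUnit test, the ``fprintf()`` calls
would presumably become ``KUNIT_EXPECT_EQ_MSG()`` expectations.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Local stand-in for the kernel's ARRAY_SIZE (assumption: userspace build). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical test-case type: one row pairs a character with its uppercase form. */
struct case_pair {
	char lower;
	char upper;
};

/* One table, reused by both checks below. */
static const struct case_pair cases[] = {
	{ .lower = 'a', .upper = 'A' },
	{ .lower = 'm', .upper = 'M' },
	{ .lower = 'z', .upper = 'Z' },
	{ .lower = '1', .upper = '1' },	/* non-letters map to themselves */
};

/* Returns the number of rows for which toupper(lower) != upper. */
int check_toupper(void)
{
	size_t i;
	int failures = 0;

	for (i = 0; i < ARRAY_SIZE(cases); ++i) {
		int got = toupper((unsigned char)cases[i].lower);

		if (got != cases[i].upper) {
			/* The message names the input, since the line number alone is ambiguous in a loop. */
			fprintf(stderr, "toupper('%c'): got '%c', want '%c'\n",
				cases[i].lower, got, cases[i].upper);
			failures++;
		}
	}
	return failures;
}

/* Returns the number of rows for which tolower(upper) != lower. */
int check_tolower(void)
{
	size_t i;
	int failures = 0;

	for (i = 0; i < ARRAY_SIZE(cases); ++i) {
		int got = tolower((unsigned char)cases[i].upper);

		if (got != cases[i].lower) {
			fprintf(stderr, "tolower('%c'): got '%c', want '%c'\n",
				cases[i].upper, got, cases[i].lower);
			failures++;
		}
	}
	return failures;
}
```

Both functions walk the same ``cases`` array in opposite directions, so
adding a row extends both checks at once. That reuse is the benefit the
``undo_md5sum`` bullet was trying to convey.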


>  .. _kunit-on-non-uml:
>
>  KUnit on non-UML architectures
>
> base-commit: 77c8473edf7f7664137f555cfcdc8c460bbd947d
> --
> 2.29.1.341.ge80a0c044ae-goog
>