Message-ID: <20190211084916.GB62722@gmail.com>
Date: Mon, 11 Feb 2019 09:49:16 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Mark Brown <broonie@...nel.org>
Cc: Shuah Khan <shuah@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H . Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...capital.net>,
linux-kernel@...r.kernel.org, x86@...nel.org,
linux-kselftest@...r.kernel.org, Dan Rue <dan.rue@...aro.org>
Subject: Re: [PATCH 2/2] selftests/x86/fsgsbase: Default to trying to run the
test repeatedly
* Mark Brown <broonie@...nel.org> wrote:
> In automated testing it has been found that on many systems the fsgsbase
> test fails intermittently. This was reported and discussed a while
> back:
>
> https://lore.kernel.org/lkml/20180126153631.ha7yc33fj5uhitjo@xps/
>
> with the analysis concluding that this is a hardware issue affecting a
> subset of systems but no fix has been merged as yet. As well as the
> actual problem found by testing the intermittent test failure is causing
> issues for the people doing the automated testing due to the noise.
>
> In order to make the testing stable modify the test program to iterate
> through the test repeatedly, choosing 5000 iterations based on prior
> reports and local testing. This unfortunately greatly increases the
> execution time for the selftests when things succeed which isn't great,
> in my local tests on a range of systems it pushes the execution time up
> to approximately a minute when no failures are encountered.
>
> Reported-by: Dan Rue <dan.rue@...aro.org>
> Signed-off-by: Mark Brown <broonie@...nel.org>
> ---
> tools/testing/selftests/x86/fsgsbase.c | 27 +++++++++++++++++++++++++-
> 1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
> index 6cda6daa1f8c..83410749ff1f 100644
> --- a/tools/testing/selftests/x86/fsgsbase.c
> +++ b/tools/testing/selftests/x86/fsgsbase.c
> @@ -379,7 +379,7 @@ static void test_unexpected_base(void)
>  	}
>  }
>
> -int main()
> +int test()
>  {
>  	pthread_t thread;
>
> @@ -437,3 +437,28 @@ int main()
>
>  	return nerrs == 0 ? 0 : 1;
>  }
> +
> +int main()
> +{
> +	int tries = 5000;
> +	int i;
> +
> +	if (tries > 1)
> +		quiet = true;
> +
> +	for (i = 0; i < tries; i++) {
> +		if (test() != 0)
> +			break;
> +	}
> +
> +	if (quiet) {
> +		if (nerrs) {
> +			printf("[FAIL] %d errors detected in %d tries\n",
> +			       nerrs, i + 1);
> +		} else {
> +			printf("[PASS] %d runs succeeded\n", i);
> +		}
> +	}
> +
> +	return nerrs == 0 ? 0 : 1;
> +}

So this isn't very user-friendly either: previously it would run a
testcase and immediately provide output.

Now it just starts and 'hangs':

  galatea:~/linux/linux/tools/testing/selftests/x86> ./fsgsbase_64

I got bored and Ctrl-C-ed it after ~30 seconds.

How long is this supposed to run, and why isn't the user informed?

Also, testcases should really be short, so I think a better approach
would be to thread the test-case and start an instance on every CPU. That
should also exercise SMP bugs, if any.
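
A rough sketch of what "one instance per CPU" could look like is below. It
is not taken from the patch or from fsgsbase.c: run_per_cpu(), test_fn_t,
sample_pass() and sample_fail() are hypothetical names used only for
illustration. It assumes the test body has been made thread-safe, which
the quoted test() is not as written, since it updates the global nerrs.
For simplicity the threads are not pinned; placement is left to the
scheduler, and pinning each thread to its own CPU would be a natural
refinement.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical signature for a thread-safe test body: 0 means success. */
typedef int (*test_fn_t)(void);

struct worker {
	pthread_t thread;
	test_fn_t fn;
	int result;
};

static void *worker_main(void *arg)
{
	struct worker *w = arg;

	w->result = w->fn();
	return NULL;
}

/*
 * Start one instance of fn per online CPU and wait for all of them.
 * Returns the number of instances that failed or could not be started,
 * or -1 if the worker array could not be allocated.
 * Assumption: threads are not pinned, the scheduler spreads them out.
 */
int run_per_cpu(test_fn_t fn)
{
	long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
	struct worker *workers;
	int failures = 0;
	long started = 0, i;

	assert(fn != NULL);
	if (ncpus < 1)
		ncpus = 1;	/* fall back to a single instance */

	workers = calloc(ncpus, sizeof(*workers));
	if (!workers)
		return -1;

	for (i = 0; i < ncpus; i++) {
		workers[started].fn = fn;
		if (pthread_create(&workers[started].thread, NULL,
				   worker_main, &workers[started]) != 0) {
			failures++;	/* count a thread we could not start */
			continue;
		}
		started++;
	}

	for (i = 0; i < started; i++) {
		pthread_join(workers[i].thread, NULL);
		if (workers[i].result != 0)
			failures++;
	}

	free(workers);
	return failures;
}

/* Trivial stand-ins for a real test body, for illustration only. */
int sample_pass(void) { return 0; }
int sample_fail(void) { return 1; }
```

A main() built on this would call run_per_cpu() once and print its
[PASS] or [FAIL] line as soon as the threads have been joined, so the
user gets a result promptly instead of waiting on a long serial loop.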
Thanks,
Ingo