linux-kernel - Re: perf tool: About tests debug level

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAP-5=fUxQZ+rxLEn6jeRNVMf48BaPNdaUdoMs8LY4P-GROiOnw@mail.gmail.com>
Date:   Mon, 21 Jun 2021 22:04:46 -0700
From:   Ian Rogers <irogers@...gle.com>
To:     John Garry <john.garry@...wei.com>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Jiri Olsa <jolsa@...hat.com>,
        "linux-perf-users@...r.kernel.org" <linux-perf-users@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Jin Yao <yao.jin@...ux.intel.com>
Subject: Re: perf tool: About tests debug level

On Mon, Jun 21, 2021 at 4:26 AM John Garry <john.garry@...wei.com> wrote:
>
> Hi guys,
>
> I noticed that when it was discovered recently that the new icelake JSON
> did not pass "PMU events" test, running the test with -v makes pr_err()
> and pr_debug() come out at the same level, so it's hard to distinguish
> the important logs.
>
> Here is a link:
> https://lore.kernel.org/lkml/YLdq%2FH8CXYgHWzCL@kernel.org/
>
> And here is an extract:
>
> parsing '(cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc) * msr@tsc@
> / 1000000000 / duration_time'
> parsing '( ( 1 * ( fp_arith_inst_retired.scalar_single +
> fp_arith_inst_retired.scalar_double ) + 2 *
> fp_arith_inst_retired.128b_packed_double + 4 * (
> fp_arith_inst_retired.128b_packed_single +
> fp_arith_inst_retired.256b_packed_double ) + 8 * (
> fp_arith_inst_retired.256b_packed_single +
> fp_arith_inst_retired.512b_packed_double ) + 16 *
> fp_arith_inst_retired.512b_packed_single ) / 1000000000 ) / duration_time'
> parsing 'cpu_clk_unhalted.thread / cpu_clk_unhalted.ref_tsc'
> parsing '1 - cpu_clk_unhalted.one_thread_active /
> cpu_clk_unhalted.ref_distributed'
> parsing 'cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread'
> parsing '( 64 * ( uncore_imc@..._count_read@ +
> uncore_imc@..._count_write@ ) / 1000000000 ) / duration_time'
> parsing '1000000000 * ( cha@...nt\=0x36\,umask\=0x21\,config\=0x40433@ /
> cha@...nt\=0x35\,umask\=0x21\,config\=0x40433@ ) / ( cha_0@...nt\=0x0@ /
> duration_time )'
> parsing 'cha@...nt\=0x36\,umask\=0x21\,config\=0x40433@ /
> cha@...nt\=0x36\,umask\=0x21\,config\=0x40433\,thresh\=1@'
> parsing '( 1000000000 * (
> cha@...nt\=0x36\,umask\=0x21\,config\=0x40433@...m /
> cha@...nt\=0x35\,umask\=0x21\,config\=0x40433@...m ) / cha_0@...nt\=0x0@
> )' [pr_debug]
> check_parse_fake failed [pr_err]
> test child finished with -1
> ---- end ----
>
> I annotated in [], above
>
> As for another issue, if you consider "Parse and process metrics", debug
> from core code comes out at same level as test code, e.g. with -v, we
> see pr_debug() from test code and core code. Again, this makes it hard
> to distinguish various levels. As an illustration, here I hack the code
> to fail a test:
>
> sudo ./perf test -v 68
> 68: Parse and process metrics                                       :
> --- start ---
> test child forked, pid 9747
> metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
> found event inst_retired.any verbose=1 pr_debug
> found event inst_retired.any verbose=1 pr_err
> found event cpu_clk_unhalted.thread verbose=1 pr_debug
> found event cpu_clk_unhalted.thread verbose=1 pr_err
> adding {inst_retired.any,cpu_clk_unhalted.thread}:W
> FAILED tests/parse-metric.c:223 IPC failed, wrong ratio
> FAILED tests/parse-metric.c:374 IPC failed
> test child finished with -1
> ---- end ----
> Parse and process metrics: FAILED!
>
> Note that the "FAILED" messages from the test code come from pr_debug().
>
> In a way, I feel that pr_debug()/err from the test is more important
> than pr_debug() from the core code (when running a test).
>
> Any opinion on this or how to improve (if anyone agrees with me)? Or am
> I missing something? Or is it not so important?

Hi John,

I think the issue is that in the parsing you don't know it's broken
until something goes wrong. Putting everything on pr_err would cause
spam in the not broken case. Improving the parsing error handling is a
big task with lex and yacc to some extent getting in the way. Perhaps
a middle way is to have a parameter to the parser that logs more, and
recursively call this in the parser when parsing fails. I guess there
is also a danger of a performance hit.

Thanks,
Ian

> Thanks,
> John