lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 14 Oct 2021 08:42:05 -0400 (EDT)
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Yafang Shao <laoar.shao@...il.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        rostedt <rostedt@...dmis.org>
Cc:     kernel test robot <oliver.sang@...el.com>,
        0day robot <lkp@...el.com>, Petr Mladek <pmladek@...e.com>,
        Kees Cook <keescook@...omium.org>,
        Peter Zijlstra <peterz@...radead.org>,
        rostedt <rostedt@...dmis.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        lkp <lkp@...ts.01.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Valentin Schneider <valentin.schneider@....com>,
        Qiang Zhang <qiang.zhang@...driver.com>,
        robdclark@...omium.org, christian@...uner.io,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        bristot <bristot@...hat.com>,
        aubrey li <aubrey.li@...ux.intel.com>,
        yu c chen <yu.c.chen@...el.com>
Subject: Re: [sched.h] 317419b91e:
 perf-sanity-tests.Parse_sched_tracepoints_fields.fail

----- On Oct 14, 2021, at 5:24 AM, Yafang Shao laoar.shao@...il.com wrote:

> On Thu, Oct 14, 2021 at 3:08 PM kernel test robot <oliver.sang@...el.com> wrote:
>>
>>
>>
>> Greeting,
>>
>> FYI, we noticed the following commit (built with gcc-9):
>>
>> commit: 317419b91ef4eff4e2f046088201e4dc4065caa0 ("[PATCH v3 3/4] sched.h:
>> extend task comm from 16 to 24 for CONFIG_BASE_FULL")
>> url:
>> https://github.com/0day-ci/linux/commits/Yafang-Shao/task_struct-extend-task-comm-from-16-to-24-for-CONFIG_BASE_FULL/20211010-182548
>> base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git
>> 7fd2bf83d59a2d32e0d596c5d3e623b9a0e7e2d5
>>
>> in testcase: perf-sanity-tests
>> version: perf-x86_64-7fd2bf83d59a-1_20211010
>> with following parameters:
>>
>>         perf_compiler: clang
>>         ucode: 0xde
>>
>>
>>
>> on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
>> with 32G memory
>>
>> caused below changes (please refer to attached dmesg/kmsg for entire
>> log/backtrace):
>>
>>
>>
>>
>> If you fix the issue, kindly add following tag
>> Reported-by: kernel test robot <oliver.sang@...el.com>
>>
>> 2021-10-13 18:00:46 sudo
>> /usr/src/perf_selftests-x86_64-rhel-8.3-317419b91ef4eff4e2f046088201e4dc4065caa0/tools/perf/perf
>> test 13
>> 13: DSO data reopen                                                 : Ok
>> 2021-10-13 18:00:46 sudo
>> /usr/src/perf_selftests-x86_64-rhel-8.3-317419b91ef4eff4e2f046088201e4dc4065caa0/tools/perf/perf
>> test 14
>> 14: Roundtrip evsel->name                                           : Ok
>> 2021-10-13 18:00:46 sudo
>> /usr/src/perf_selftests-x86_64-rhel-8.3-317419b91ef4eff4e2f046088201e4dc4065caa0/tools/perf/perf
>> test 15
>> 15: Parse sched tracepoints fields                                  : FAILED!
> 
> 
> That issue is caused by another hardcode 16 ...
> 
> Seems we should make some change as below,
> 
> diff --git a/tools/perf/tests/evsel-tp-sched.c
> b/tools/perf/tests/evsel-tp-sched.c
> index f9e34bd26cf3..401a737b1d85 100644
> --- a/tools/perf/tests/evsel-tp-sched.c
> +++ b/tools/perf/tests/evsel-tp-sched.c
> @@ -42,7 +42,7 @@ int test__perf_evsel__tp_sched_test(struct test
> *test __maybe_unused, int subtes
>                return -1;
>        }
> 
> -       if (evsel__test_field(evsel, "prev_comm", 16, false))
> +       if (evsel__test_field(evsel, "prev_comm", TASK_COMM_LEN, false))

tools/perf/tests/* contains userspace test programs. This means it gets the
TASK_COMM_LEN from the uapi. The fix you propose won't do any good here.

ftrace and perf access a description of the sched_switch tracepoint prev_comm field
from tracefs and store it into their respective traces. The size of this field is
derived from the kernel's TASK_COMM_LEN, which is changed by the patch triggering this
failure. Therefore, if we strictly consider the field layout as ABI, this is an ABI break.
However, if we consider that trace viewers should adapt to the changes in described event
field layout, then we should tweak this test program to accept larger values of prev_comm
field size.

A simple solution would be to tweak this test program so it can adapt to the size
change, and then figure out if any other relevant program out there notice the break.
If it happens that this ABI break is noticed by more than an in-tree test program, then
the kernel's ABI rules will require that this trace field size stays unchanged. This brings
up once more the whole topic of "Tracepoints ABI" which has been discussed repeatedly in
the past.

In short, because TRACE_EVENT exposes the tracepoint layout as ABI, if any trace viewer out
there expects a fixed-size prev_comm field for the sched_switch event, its size cannot be
changed.

Thanks,

Mathieu

>                ret = -1;
> 
>        if (evsel__test_field(evsel, "prev_pid", 4, true))
> @@ -54,7 +54,7 @@ int test__perf_evsel__tp_sched_test(struct test
> *test __maybe_unused, int subtes
>        if (evsel__test_field(evsel, "prev_state", sizeof(long), true))
>                ret = -1;
> 
> -       if (evsel__test_field(evsel, "next_comm", 16, false))
> +       if (evsel__test_field(evsel, "next_comm", TASK_COMM_LEN, false))
>                ret = -1;
> 
>        if (evsel__test_field(evsel, "next_pid", 4, true))
> @@ -72,7 +72,7 @@ int test__perf_evsel__tp_sched_test(struct test
> *test __maybe_unused, int subtes
>                return -1;
>        }
> 
> -       if (evsel__test_field(evsel, "comm", 16, false))
> +       if (evsel__test_field(evsel, "comm", TASK_COMM_LEN, false))
>                ret = -1;
> 
>        if (evsel__test_field(evsel, "pid", 4, true))
> 
> 
>> 2021-10-13 18:00:46 sudo
>> /usr/src/perf_selftests-x86_64-rhel-8.3-317419b91ef4eff4e2f046088201e4dc4065caa0/tools/perf/perf
>> test 16
>> 16: syscalls:sys_enter_openat event fields                          : Ok
>>
>>
>>
>> To reproduce:
>>
>>         git clone https://github.com/intel/lkp-tests.git
>>         cd lkp-tests
>>         sudo bin/lkp install job.yaml           # job file is attached in this email
>>         bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
>>         sudo bin/lkp run generated-yaml-file
>>
>>         # if come across any failure that blocks the test,
>>         # please remove ~/.lkp and /lkp dir to run from a clean state.
>>
>>
>>
>> ---
>> 0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
>> https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation
>>
>> Thanks,
>> Oliver Sang
>>
> 
> 
> --
> Thanks
> Yafang

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ