Message-ID: <08e1c9d0-376f-d669-6fe8-559b2fbc2f2b@efficios.com>
Date:   Wed, 8 Feb 2023 21:06:58 -0500
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     John Stultz <jstultz@...gle.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Yafang Shao <laoar.shao@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Network Development <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>,
        "linux-perf-use." <linux-perf-users@...r.kernel.org>,
        Linux-Fsdevel <linux-fsdevel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        kernel test robot <oliver.sang@...el.com>,
        kbuild test robot <lkp@...el.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        David Hildenbrand <david@...hat.com>,
        Arnaldo Carvalho de Melo <arnaldo.melo@...il.com>,
        Andrii Nakryiko <andrii.nakryiko@...il.com>,
        Michal Miroslaw <mirq-linux@...e.qmqm.pl>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Matthew Wilcox <willy@...radead.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Kees Cook <keescook@...omium.org>,
        Petr Mladek <pmladek@...e.com>,
        Kajetan Puchalski <kajetan.puchalski@....com>,
        Lukasz Luba <lukasz.luba@....com>,
        Qais Yousef <qyousef@...gle.com>,
        Daniele Di Proietto <ddiproietto@...gle.com>
Subject: Re: [PATCH v2 7/7] tools/testing/selftests/bpf: replace open-coded 16
 with TASK_COMM_LEN

On 2023-02-08 19:54, John Stultz wrote:
> On Wed, Feb 8, 2023 at 4:11 PM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
>>
>> On Wed, Feb 8, 2023 at 2:01 PM John Stultz <jstultz@...gle.com> wrote:
>>>
>>> On Sat, Nov 20, 2021 at 11:27:38AM +0000, Yafang Shao wrote:
>>>> As the sched:sched_switch tracepoint args are derived from the kernel,
>>>> we'd better make them the same as in the kernel. So the macro TASK_COMM_LEN
>>>> is converted to an enum, so that all BPF programs can get it through BTF.
>>>>
>>>> A BPF program that wants to use TASK_COMM_LEN should include the header
>>>> vmlinux.h. As for test_stacktrace_map and test_tracepoint, the types
>>>> defined in linux/bpf.h are also defined in vmlinux.h, so we don't
>>>> need to include linux/bpf.h again.
>>>>
>>>> Signed-off-by: Yafang Shao <laoar.shao@...il.com>
>>>> Acked-by: Andrii Nakryiko <andrii@...nel.org>
>>>> Acked-by: David Hildenbrand <david@...hat.com>
>>>> Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
>>>> Cc: Arnaldo Carvalho de Melo <arnaldo.melo@...il.com>
>>>> Cc: Andrii Nakryiko <andrii.nakryiko@...il.com>
>>>> Cc: Michal Miroslaw <mirq-linux@...e.qmqm.pl>
>>>> Cc: Peter Zijlstra <peterz@...radead.org>
>>>> Cc: Steven Rostedt <rostedt@...dmis.org>
>>>> Cc: Matthew Wilcox <willy@...radead.org>
>>>> Cc: David Hildenbrand <david@...hat.com>
>>>> Cc: Al Viro <viro@...iv.linux.org.uk>
>>>> Cc: Kees Cook <keescook@...omium.org>
>>>> Cc: Petr Mladek <pmladek@...e.com>
>>>> ---
>>>>   include/linux/sched.h                                   | 9 +++++++--
>>>>   tools/testing/selftests/bpf/progs/test_stacktrace_map.c | 6 +++---
>>>>   tools/testing/selftests/bpf/progs/test_tracepoint.c     | 6 +++---
>>>>   3 files changed, 13 insertions(+), 8 deletions(-)
>>>
>>> Hey all,
>>>    I know this is a little late, but I recently got a report that
>>> this change was causing older versions of perfetto to stop
>>> working.
>>>
>>> Apparently newer versions of perfetto have worked around this
>>> via the following changes:
>>>    https://android.googlesource.com/platform/external/perfetto/+/c717c93131b1b6e3705a11092a70ac47c78b731d%5E%21/
>>>    https://android.googlesource.com/platform/external/perfetto/+/160a504ad5c91a227e55f84d3e5d3fe22af7c2bb%5E%21/
>>>
>>> But for older versions of perfetto, reverting upstream commit
>>> 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16
>>> with TASK_COMM_LEN") is necessary to get it back to working.
>>>
>>> I haven't dug very far into the details, and obviously this doesn't
>>> break with the updated perfetto, but from a high level this does
>>> seem to be a breaking-userland regression.
>>>
>>> So I wanted to reach out to see if there was more context for this
>>> breakage? I don't want to raise an unnecessary stink if this was
>>> an unfortunate but forced situation.
>>
>> Let me understand what you're saying...
>>
>> The commit 3087c61ed2c4 did
>>
>> -/* Task command name length: */
>> -#define TASK_COMM_LEN                  16
>> +/*
>> + * Define the task command name length as enum, then it can be visible to
>> + * BPF programs.
>> + */
>> +enum {
>> +       TASK_COMM_LEN = 16,
>> +};
>>
>>
>> and that caused:
>>
>> cat /sys/kernel/debug/tracing/events/task/task_newtask/format
>>
>> to print
>> field:char comm[TASK_COMM_LEN];    offset:12;    size:16;    signed:0;
>> instead of
>> field:char comm[16];    offset:12;    size:16;    signed:0;
>>
>> so the ftrace parsing android tracing tool had to do:
>>
>> -  if (Match(type_and_name.c_str(), R"(char [a-zA-Z_]+\[[0-9]+\])")) {
>> +  if (Match(type_and_name.c_str(),
>> +            R"(char [a-zA-Z_][a-zA-Z_0-9]*\[[a-zA-Z_0-9]+\])")) {
>>
>> to work around this change.
>> Right?
> 
> I believe so.
> 
>> And what are you proposing?
> 
> I'm not proposing anything. I was just wanting to understand more
> context around this, as it outwardly appears to be a user-breaking
> change, and that is usually not done, so I figured it was an issue
> worth raising.
> 
> If the debug/tracing/*/format output is in the murky not-really-abi
> space, that's fine, but I wanted to know if this was understood as
> something that may require userland updates or if this was an
> unexpected side-effect.

If you are looking for the root cause, this is the kernel code generating that output:

kernel/trace/trace_events.c:f_show()

         /*
          * Smartly shows the array type(except dynamic array).
          * Normal:
          *      field:TYPE VAR
          * If TYPE := TYPE[LEN], it is shown:
          *      field:TYPE VAR[LEN]
          */

where it uses the content of field->type (a string) to format the VAR[LEN] part.

That string in turn comes from how the struct trace_event_fields
entries are defined in:

include/trace/trace_events.h at stage 4, where the following macros are in effect:

include/trace/stages/stage4_event_fields.h:

#undef __array
#define __array(_type, _item, _len) {                                   \
         .type = #_type"["__stringify(_len)"]", .name = #_item,          \
         .size = sizeof(_type[_len]), .align = ALIGN_STRUCTFIELD(_type), \
         .is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER },

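I suspect the real culprit here is the use of __stringify(_len). It happens to work
when _len is a macro, because the preprocessor expands it to 16 before stringifying,
but not when it is an enum label, which the preprocessor leaves as the identifier
TASK_COMM_LEN.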
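
To illustrate the difference, here is a minimal user-space sketch (not kernel
code). STRINGIFY() has the same two-level shape as the kernel's __stringify();
LEN_AS_MACRO, LEN_AS_ENUM and the two helper functions are made-up names for
this example only.

```c
#include <stdio.h>
#include <string.h>

/* Two-level stringification, same shape as the kernel's __stringify(). */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x)   STRINGIFY_1(x)

/* Hypothetical names, for illustration only. */
#define LEN_AS_MACRO 16
enum { LEN_AS_ENUM = 16 };

/*
 * Both helpers mimic how __array() builds the .type string by pasting
 * string literals together at preprocessing time.
 */
static const char *type_with_macro(void)
{
	/* The macro is expanded to 16 before '#' is applied. */
	return "char" "[" STRINGIFY(LEN_AS_MACRO) "]";
}

static const char *type_with_enum(void)
{
	/* The preprocessor knows nothing about enums: the name is kept. */
	return "char" "[" STRINGIFY(LEN_AS_ENUM) "]";
}
```

The first helper yields "char[16]", the second "char[LEN_AS_ENUM]", which
matches the change seen in the format file.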

One possible solution to make this more robust would be to extend
struct trace_event_fields with one more field that indicates the length
of an array as an actual integer, without storing it in its stringified
form in the type, and do the formatting in f_show where it belongs.

This way everybody can stay happy and no ABI is broken.
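
A rough user-space sketch of that idea, under my own assumptions: struct
field_desc, ARRAY_FIELD() and format_field() are hypothetical, simplified
stand-ins for struct trace_event_fields, __array() and the relevant part of
f_show(), and are not the actual kernel definitions.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct trace_event_fields. */
struct field_desc {
	const char *type;	/* element type only, e.g. "char" */
	const char *name;	/* field name, e.g. "comm" */
	int len;		/* array length as an integer; 0 if not an array */
};

/* The length is evaluated by the compiler, so macros and enums both work. */
#define ARRAY_FIELD(_type, _item, _len) \
	{ .type = #_type, .name = #_item, .len = (_len) }

enum {
	TASK_COMM_LEN = 16,
};

static const struct field_desc comm_field =
	ARRAY_FIELD(char, comm, TASK_COMM_LEN);

/* Format the field at show time, printing the length as a number. */
static int format_field(char *buf, size_t size, const struct field_desc *f)
{
	if (f->len > 0)
		return snprintf(buf, size, "field:%s %s[%d];",
				f->type, f->name, f->len);
	return snprintf(buf, size, "field:%s %s;", f->type, f->name);
}
```

With this, formatting comm_field gives "field:char comm[16];" whether
TASK_COMM_LEN is a macro or an enum label.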

Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
