Message-ID: <fefda00a-1a08-3a53-efbc-93c36292b77d@fb.com>
Date: Sun, 19 Apr 2020 00:01:55 -0700
From: Yonghong Song <yhs@...com>
To: Eelco Chaudron <echaudro@...hat.com>
CC: <bpf@...r.kernel.org>, <davem@...emloft.net>,
<netdev@...r.kernel.org>, <ast@...nel.org>, <daniel@...earbox.net>,
<kafai@...com>, <songliubraving@...com>, <andriin@...com>
Subject: Re: [RFC PATCH bpf-next 0/3] bpf: add tracing for XDP programs using
the BPF_PROG_TEST_RUN API
On 4/16/20 5:45 AM, Eelco Chaudron wrote:
>
>
> On 23 Mar 2020, at 23:47, Yonghong Song wrote:
>
>> On 3/18/20 6:06 AM, Eelco Chaudron wrote:
>>> I sent out this RFC to get an idea if the approach suggested here
>>> would be something other people would also like to see. In addition,
>>> this cover letter mentions some concerns and questions that need
>>> answers before we can move to an acceptable implementation.
>>>
>>> This patch adds support for tracing eBPF XDP programs that get
>>> executed via the BPF_PROG_TEST_RUN command of the bpf() syscall.
>>> This is done by switching from the JIT (if enabled) to executing the
>>> program using the interpreter and recording each executed instruction.
>>
>> Thanks for working on this! I think this is a useful feature
>> for doing semi single-step debugging in a safe environment. The
>> initial input, e.g., a packet or some other kernel context, may be
>> captured in a production error path. People can then use this to
>> easily do some post-mortem analysis. This feature can also be used
>> for initial single-step debugging with better bpftool support.
>>
>>>
>>> For now, the execution history is printed to the kernel ring buffer
>>> using pr_info(); the final version should store enough data in a
>>> user-supplied buffer to reconstruct this output. This should
>>> probably be part of bpftool, i.e., dump a similar output, plus the
>>> ability to store all of this in an ELF-like format for
>>> dumping/analyzing/replaying at a later stage.
>>>
>>> This patch does not dump the XDP packet content before and after
>>> execution; however, this data is available to the caller of the API.
>>
>> I would like to see the feature implemented in a way that applies
>> to all existing test_run program types and is extensible to future
>> program types.
>
> Yes, this makes sense, but as I’m only familiar with the XDP part, I
> focused on that.
That is okay. Let us first focus on XDP and then make it more general
before landing. During implementation you want to keep anything not
related to XDP in separate functions so they can be reused.
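For example, the split could look roughly like the sketch below; all
names here (bpf_trace_rec, bpf_test_run_trace, and the extra arguments
to __bpf_prog_run_trace()) are hypothetical, not existing kernel
symbols:

/* Hypothetical record of one executed instruction. */
struct bpf_trace_rec {
	__u32 insn_idx;		/* index of the executed instruction */
	__u64 regs[11];		/* snapshot of r0-r10 after the insn */
};

/* Program-type-independent part: run the program through the tracing
 * interpreter and fill the caller-provided record array.
 */
static int bpf_test_run_trace(struct bpf_prog *prog, void *ctx,
			      struct bpf_trace_rec *recs, u32 max_recs)
{
	/* common setup (stats, preemption handling) would live here */
	return __bpf_prog_run_trace(prog, ctx, recs, max_recs);
}

/* XDP-specific part: only builds the xdp_buff from the test_run
 * attributes and then delegates to the generic helper above.
 */
static int bpf_prog_test_run_xdp_trace(struct bpf_prog *prog,
				       struct xdp_buff *xdp,
				       struct bpf_trace_rec *recs,
				       u32 max_recs)
{
	return bpf_test_run_trace(prog, xdp, recs, max_recs);
}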
>
>> There are different ways to send data back to user space. A user
>> buffer is one way, a ring buffer is another, and a seq_file can also
>> be used. Performance is not a concern here, so we can choose the one
>> with the best usability.
>
> As we need a buffer, the easiest way would be to supply a user
> buffer. I guess a raw perf buffer might also work, but the API might
> get complex… I’ll dig into this a bit for the next RFC.
You may use XDP's bpf_perf_event_output(). Not sure whether it can
serve your purpose or not. If not, I think a user buffer is okay.
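For the user-buffer route, one possible shape would be to extend the
test_run section of union bpf_attr and copy the records out after the
run. The trace_* fields and the helper below are made up for
illustration; they are not part of the current uapi:

/* Hypothetical additions to the BPF_PROG_TEST_RUN part of bpf_attr. */
	__aligned_u64	trace_out;	/* user buffer for trace records */
	__u32		trace_size_in;	/* size of that buffer in bytes */
	__u32		trace_size_out;	/* bytes actually written back */

/* After the traced run, hand the collected records to user space. */
static int bpf_test_trace_copy_out(const union bpf_attr *kattr,
				   union bpf_attr __user *uattr,
				   const void *recs, u32 size)
{
	void __user *ubuf = u64_to_user_ptr(kattr->test.trace_out);

	if (size > kattr->test.trace_size_in)
		return -ENOSPC;
	if (copy_to_user(ubuf, recs, size))
		return -EFAULT;
	if (put_user(size, &uattr->test.trace_size_out))
		return -EFAULT;
	return 0;
}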
>
>>>
>>> The __bpf_prog_run_trace() interpreter is a copy of __bpf_prog_run(),
>>> and we probably need a smarter way to reuse the code rather than a
>>> blind copy with some changes.
>>
>> Yes, reusing the code is a must. Using the existing interpreter
>> framework is the easiest way to get semi single-step support.
>
> Any idea how to do it cleanly? I guess I could move the interpreter code
> out of the core file and include it twice.
I think refactoring it into a different file is totally okay. You can
then pass different callback functions from test_run and from the
normal interpreter call.
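The shape I have in mind: the shared interpreter body takes an optional
per-instruction callback that is NULL on the normal fast path. Here is
a toy userspace model of just that shape (none of this is kernel code;
the real interpreter would keep its jump-table dispatch):

/* Toy model of an interpreter loop with an optional trace callback. */
#include <stdio.h>
#include <stdint.h>

struct insn { uint8_t op; int64_t imm; };	/* toy "bytecode" */

typedef void (*insn_cb)(const struct insn *insn, const int64_t *regs,
			void *priv);

/* One interpreter body; tracing and the fast path differ only in the
 * callback that is passed in. */
static int64_t run(const struct insn *prog, insn_cb cb, void *priv)
{
	int64_t regs[2] = { 0, 0 };
	const struct insn *pc = prog;

	for (;;) {
		if (cb)			/* record before executing */
			cb(pc, regs, priv);
		switch (pc->op) {
		case 0: regs[0] += pc->imm; break;	/* r0 += imm */
		case 1: regs[1] = regs[0]; break;	/* r1 = r0 */
		case 2: return regs[0];			/* exit */
		}
		pc++;
	}
}

static void trace(const struct insn *insn, const int64_t *regs,
		  void *priv)
{
	printf("op=%u imm=%lld r0=%lld r1=%lld\n", (unsigned)insn->op,
	       (long long)insn->imm, (long long)regs[0],
	       (long long)regs[1]);
}

int main(void)
{
	struct insn prog[] = { {0, 40}, {0, 2}, {1, 0}, {2, 0} };

	run(prog, NULL, NULL);		/* fast path: no tracing */
	printf("traced result: %lld\n",
	       (long long)run(prog, trace, NULL));
	return 0;
}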
>
>>> Enabling the interpreter opens up the kernel to Spectre variant 2;
>>> I guess that's why the BPF_JIT_ALWAYS_ON option was introduced
>>> (commit 290af86629b2). Enabling it for debugging in the field does
>>> not sound like an option (talking to people doing kernel
>>> distributions). Any idea how to work around this (an lfence before
>>> any indirect call will slow things down, but I guess for debugging
>>> this does not matter)? I need to research this more as I'm no
>>> expert in this area. But I think this needs to be solved as I see
>>> it as a show stopper, so any input is welcome.
>>
>> An lfence for the indirect call is okay here for test_run. Just be
>> careful not to introduce any performance penalty for the
>> non-test-run prog-run path.
>
> My idea here was to do it at compile time and only if the interpreter
> was disabled.
This should be okay. If people do not have BPF_JIT_ALWAYS_ON, that
probably means they are not worried about interpreter indirect
calls...
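The compile-time split could then be as small as the sketch below; the
macro name is made up, and barrier_nospec() is the existing x86
speculation barrier (other architectures would need their own choice):

/* Hypothetical guard: with BPF_JIT_ALWAYS_ON the interpreter exists
 * only for test_run tracing, so harden its indirect dispatch; when
 * the interpreter is enabled anyway, add nothing to it.
 */
#ifdef CONFIG_BPF_JIT_ALWAYS_ON
#define bpf_trace_nospec()	barrier_nospec()
#else
#define bpf_trace_nospec()	do { } while (0)
#endif

	/* in the tracing interpreter, before each indirect jump: */
	bpf_trace_nospec();
	goto *jumptable[insn->code];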
>
>>>
>>> To allow bpf_call support for tracing, the general interpreter is
>>> currently enabled. See the fixup_call_args() function for why this
>>> is needed. We might need to find a way to fix this (see the above
>>> section on Spectre).
>>>
>>> Signed-off-by: Eelco Chaudron <echaudro@...hat.com>
>>>
>
> One final question: did you (or anyone else) look at the actual code,
> and do you have some tips or things to look at?
I briefly looked at the code, but did not go through the details.
Next time I will try to look at it in more detail.
>
>
> I’ll try to do another RFC, cleaning up the duplicate interpreter
> code and sending the actual trace data to userspace. I will hack a
> userspace decoder together, or maybe even start integrating it into
> bpftool (if not, it will be part of the follow-on RFC).
>