Message-ID: <8449BBF3-E754-4ABC-BFEF-A8F264297F2D@fb.com>
Date: Fri, 17 May 2019 21:48:14 +0000
From: Song Liu <songliubraving@...com>
To: Alexei Starovoitov <ast@...com>
CC: Peter Zijlstra <peterz@...radead.org>,
Kairui Song <kasong@...hat.com>,
lkml <linux-kernel@...r.kernel.org>,
Kernel Team <Kernel-team@...com>,
"Josh Poimboeuf" <jpoimboe@...hat.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>
Subject: Re: Getting empty callchain from perf_callchain_kernel()
> On May 17, 2019, at 2:06 PM, Alexei Starovoitov <ast@...com> wrote:
>
> On 5/17/19 11:40 AM, Song Liu wrote:
>> +Alexei, Daniel, and bpf
>>
>>> On May 17, 2019, at 2:10 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>>>
>>> On Fri, May 17, 2019 at 04:15:39PM +0800, Kairui Song wrote:
>>>> Hi, I think the actual problem is that bpf_get_stackid_tp (and maybe
>>>> some other bfp functions) is now broken, or, strating an unwind
>>>> directly inside a bpf program will end up strangely. It have following
>>>> kernel message:
>>>
>>> Urgh, what is that bpf_get_stackid_tp() doing to get the regs? I can't
>>> follow.
>>
>> I guess we need something like the following? (we should be able to
>> optimize the PER_CPU stuff).
>>
>> Thanks,
>> Song
>>
>>
>> diff --git i/kernel/trace/bpf_trace.c w/kernel/trace/bpf_trace.c
>> index f92d6ad5e080..c525149028a7 100644
>> --- i/kernel/trace/bpf_trace.c
>> +++ w/kernel/trace/bpf_trace.c
>> @@ -696,11 +696,13 @@ static const struct bpf_func_proto bpf_perf_event_output_proto_tp = {
>> .arg5_type = ARG_CONST_SIZE_OR_ZERO,
>> };
>>
>> +static DEFINE_PER_CPU(struct pt_regs, bpf_stackid_tp_regs);
>> BPF_CALL_3(bpf_get_stackid_tp, void *, tp_buff, struct bpf_map *, map,
>> u64, flags)
>> {
>> - struct pt_regs *regs = *(struct pt_regs **)tp_buff;
>> + struct pt_regs *regs = this_cpu_ptr(&bpf_stackid_tp_regs);
>>
>> + perf_fetch_caller_regs(regs);
>
> No. pt_regs is already passed in. It's the first argument.
> If we call perf_fetch_caller_regs() again the stack trace will be wrong.
> bpf prog should not see itself, interpreter or all the frames in between.
Thanks Alexei! I get it now.
In bpf_get_stackid_tp(), the pt_regs pointer is obtained by dereferencing
the first field of tp_buff:
    struct pt_regs *regs = *(struct pt_regs **)tp_buff;
tp_buff points to something like
    struct sched_switch_args {
        unsigned long long pad;
        char prev_comm[16];
        int prev_pid;
        int prev_prio;
        long long prev_state;
        char next_comm[16];
        int next_pid;
        int next_prio;
    };
where the first field "pad" is a pointer to pt_regs.
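To make the layout concrete, here is a minimal userspace sketch, not kernel code. The two-field struct pt_regs and the helpers store_regs(), regs_from_tp_buff() and demo_roundtrip() are hypothetical names made up for illustration. It uses memcpy() instead of the kernel's direct cast so that it stays well-defined under strict aliasing in userspace.

```c
#include <string.h>

/* Hypothetical stand-in for the kernel's struct pt_regs (illustration only). */
struct pt_regs {
    unsigned long ip;
    unsigned long sp;
};

/* Same field layout as the struct quoted above. */
struct sched_switch_args {
    unsigned long long pad;   /* first 8 bytes: holds the pt_regs pointer */
    char prev_comm[16];
    int prev_pid;
    int prev_prio;
    long long prev_state;
    char next_comm[16];
    int next_pid;
    int next_prio;
};

/* Hypothetical helper: stash the pt_regs pointer at the start of the buffer. */
void store_regs(struct sched_switch_args *args, struct pt_regs *regs)
{
    args->pad = 0;
    memcpy(&args->pad, &regs, sizeof(regs));
}

/*
 * Hypothetical helper: recover the pointer from the start of tp_buff.
 * Reads the same bytes as *(struct pt_regs **)tp_buff, but via memcpy().
 */
struct pt_regs *regs_from_tp_buff(const void *tp_buff)
{
    struct pt_regs *regs;

    memcpy(&regs, tp_buff, sizeof(regs));
    return regs;
}

/* Returns 1 if the pointer survives the round trip through "pad". */
int demo_roundtrip(void)
{
    struct pt_regs regs = { .ip = 0x1234, .sp = 0 };
    struct sched_switch_args args;
    struct pt_regs *out;

    store_regs(&args, &regs);
    out = regs_from_tp_buff(&args);
    return out == &regs && out->ip == 0x1234;
}
```

The point is only that the helper never looks past the first pointer-sized bytes of tp_buff; whatever regs were saved there are the ones the unwinder starts from.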
@Kairui, I think you confirmed that the current code gives an empty call
trace with the ORC unwinder? If that's the case, can we add regs->ip back
(as in the first email of this thread)?
Thanks,
Song