Message-ID: <c2a01486-9b4f-4bf0-91a6-b325e66b4883@fb.com>
Date: Mon, 23 Apr 2018 17:06:15 -0700
From: Yonghong Song <yhs@...com>
To: Peter Zijlstra <peterz@...radead.org>,
Song Liu <songliubraving@...com>
CC: <netdev@...r.kernel.org>, <ast@...nel.org>, <daniel@...earbox.net>,
<kernel-team@...com>, <hannes@...xchg.org>, <qinteng@...com>
Subject: Re: [PATCH bpf-next v4 1/2] bpf: extend stackmap to save
binary_build_id+offset instead of address
Hi, Peter,
I have a question regarding one of your comments below.
On 3/12/18 3:01 PM, Peter Zijlstra wrote:
> On Mon, Mar 12, 2018 at 01:39:56PM -0700, Song Liu wrote:
>> +static void stack_map_get_build_id_offset(struct bpf_map *map,
>> + struct stack_map_bucket *bucket,
>> + u64 *ips, u32 trace_nr)
>> +{
>> + int i;
>> + struct vm_area_struct *vma;
>> + struct bpf_stack_build_id *id_offs;
>> +
>> + bucket->nr = trace_nr;
>> + id_offs = (struct bpf_stack_build_id *)bucket->data;
>> +
>> + if (!current || !current->mm ||
>> + down_read_trylock(&current->mm->mmap_sem) == 0) {
>
> You probably want an in_nmi() before the down_read_trylock(). Doing
> up_read() is an absolute no-no from NMI context.
Below is the final code from Song:
/*
* We cannot do up_read() in nmi context, so build_id lookup is
* only supported for non-nmi events. If at some point, it is
* possible to run find_vma() without taking the semaphore, we
* would like to allow build_id lookup in nmi context.
*
* Same fallback is used for kernel stack (!user) on a stackmap
* with build_id.
*/
if (!user || !current || !current->mm || in_nmi() ||
down_read_trylock(&current->mm->mmap_sem) == 0) {
/* cannot access current->mm, fall back to ips */
for (i = 0; i < trace_nr; i++) {
id_offs[i].status = BPF_STACK_BUILD_ID_IP;
id_offs[i].ip = ips[i];
}
return;
}
....
>
> And IIUC its 'trivial' to use this stuff with hardware counters.
Here, if I read you correctly, you mention that it is 'trivial' to use
the build_id support with hardware counters. However, a hardware
counter overflow triggers an NMI, so with the logic above the lookup
falls back to the old IP-only behavior.
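To make the concern concrete, here is a minimal userspace sketch of the
guard condition as I read it. The names sketch_ctx and
sketch_falls_back_to_ips are hypothetical and only model the booleans
involved; this is not the kernel code.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the inputs to the guard in the final code.
 * Each field stands in for one term of the kernel condition.
 */
struct sketch_ctx {
	bool user;        /* stack being collected is a user stack */
	bool has_mm;      /* models: current && current->mm */
	bool in_nmi;      /* models: in_nmi() */
	bool trylock_ok;  /* models: down_read_trylock() != 0 */
};

/* Returns true when the whole trace would be stored as raw IPs. */
static bool sketch_falls_back_to_ips(const struct sketch_ctx *c)
{
	return !c->user || !c->has_mm || c->in_nmi || !c->trylock_ok;
}
```

With in_nmi set, as I would expect for a hardware counter overflow
sample, this returns true regardless of the other fields, so every
frame is recorded with BPF_STACK_BUILD_ID_IP.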
Could you explain a little more how to get the build_id for hardware
counter overflow events?
Thanks!
>
>> + /* cannot access current->mm, fall back to ips */
>> + for (i = 0; i < trace_nr; i++) {
>> + id_offs[i].status = BPF_STACK_BUILD_ID_IP;
>> + id_offs[i].ip = ips[i];
>> + }
>> + return;
>> + }
>> +
>> + for (i = 0; i < trace_nr; i++) {
>> + vma = find_vma(current->mm, ips[i]);
>> + if (!vma || stack_map_get_build_id(vma, id_offs[i].build_id)) {
>> + /* per entry fall back to ips */
>> + id_offs[i].status = BPF_STACK_BUILD_ID_IP;
>> + id_offs[i].ip = ips[i];
>> + continue;
>> + }
>> + id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
>> + - vma->vm_start;
>> + id_offs[i].status = BPF_STACK_BUILD_ID_VALID;
>> + }
>> + up_read(&current->mm->mmap_sem);
>> +}
>
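For reference, the per-frame offset computed in the quoted loop can be
checked with a small userspace sketch. The names sketch_vma and
sketch_file_offset are hypothetical, and a page shift of 12 (4 KiB
pages) is an assumption, not something taken from the patch.

```c
#include <stdint.h>

/* Assumed page shift for illustration only (4 KiB pages). */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for the two vm_area_struct fields used. */
struct sketch_vma {
	uint64_t vm_start;  /* start address of the mapping */
	uint64_t vm_pgoff;  /* offset of the mapping in the file, in pages */
};

/* File offset of ip: mapping's file offset plus ip's offset in the mapping. */
static uint64_t sketch_file_offset(const struct sketch_vma *vma, uint64_t ip)
{
	return (vma->vm_pgoff << SKETCH_PAGE_SHIFT) + ip - vma->vm_start;
}
```

For example, with vm_start = 0x7f0000001000, vm_pgoff = 2 and
ip = 0x7f0000001234, this gives 0x2000 + 0x234 = 0x2234.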