Message-ID: <5555BA6A.50906@huawei.com>
Date: Fri, 15 May 2015 17:20:42 +0800
From: "Wangnan (F)" <wangnan0@...wei.com>
To: Alexei Starovoitov <ast@...mgrid.com>,
<linux-kernel@...r.kernel.org>
CC: lizefan 00213767 <lizefan@...wei.com>
Subject: Re: [BUG] kernel panic after bpf program removed.
On 2015/5/15 13:37, Alexei Starovoitov wrote:
> On 5/14/15 8:54 PM, Wangnan (F) wrote:
> >> Hi Alexei Starovoitov and others,
>>
>> I triggered a kernel panic when developing my 'perf bpf' facility. The
>> call stack is listed at the bottom of
>> this mail.
>>
> >> I attached two bpf programs on 'kmem_cache_free%return' and
> >> '__alloc_pages_nodemask'. The programs are very simple.
> >> The panic is raised after closing the bpf program and the perf event
> >> file. It looks like the panic is caused by a race between closing the
> >> perf event fd and the bpf program fd. I'm unable to reproduce this
> >> problem with similar operations.
>>
> >> The following is the exact instruction that causes the panic.
>
> thanks for the report.
> Looks like pointer 'prog == 0x6c0' is passed into bpf_prog_put,
> which means that event->tp_event was freed and memory reused before
> free_event_rcu() was called.
>
> I think it's not perf_event_fd racing with prog_fd, but rather
> with kprobe freeing:
> __free_event()
>   event->destroy(event)
>     perf_trace_destroy
>       perf_trace_event_unreg
> which is dropping event->tp_event->perf_refcount
> that allows kprobe freeing to proceed in:
> unregister_kprobe_event
>   trace_remove_event_call
>     probe_remove_event_call
> and eventually tp_event to get freed.
>
> I think calling perf_event_free_bpf_prog()
> from __free_event() instead of free_event_rcu() will fix the race,
> but please double check my analysis.
> Also please send me a reproducer script. I'd like to see it crashing
> first before the fix and not crashing afterwards.
>
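To make the ordering in the analysis above concrete, here is a minimal user-space sketch of it. It is only a model under stated assumptions: every model_* name is hypothetical, none of this is kernel code, and it assumes the worst case in which tp_event is released as soon as perf_refcount reaches zero. A flag stands in for the actual free, so the sketch never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real kernel structures. */
struct model_tp_event {
	int perf_refcount;	/* perf users of the kprobe event */
	int prog_refs;		/* references held on the attached bpf program */
	bool freed;		/* set once the kprobe side has released it */
};

struct model_event {
	struct model_tp_event *tp_event;
};

/* Which teardown order to model. */
enum model_order {
	PROG_PUT_AFTER_DESTROY,		/* like doing it from the RCU callback */
	PROG_PUT_BEFORE_DESTROY,	/* like doing it first, before destroy */
};

/*
 * Models event->destroy(): drops perf_refcount. Assumption for this
 * sketch: kprobe removal releases tp_event as soon as the count hits 0.
 */
static void model_destroy(struct model_event *ev)
{
	if (--ev->tp_event->perf_refcount == 0)
		ev->tp_event->freed = true;
}

/*
 * Models dropping the program reference through ev->tp_event.
 * Returns true if tp_event had already been released (a stale access).
 */
static bool model_free_bpf_prog(struct model_event *ev)
{
	if (ev->tp_event->freed)
		return true;
	ev->tp_event->prog_refs--;
	return false;
}

/* Runs one teardown in the given order; true means a stale access. */
bool model_teardown(enum model_order order)
{
	struct model_tp_event tp = {
		.perf_refcount = 1, .prog_refs = 1, .freed = false,
	};
	struct model_event ev = { .tp_event = &tp };
	bool stale;

	if (order == PROG_PUT_BEFORE_DESTROY) {
		stale = model_free_bpf_prog(&ev);
		model_destroy(&ev);
	} else {
		model_destroy(&ev);
		stale = model_free_bpf_prog(&ev);
	}
	return stale;
}

/* Checks both orders: only the late program put sees a released tp_event. */
void model_check(void)
{
	assert(model_teardown(PROG_PUT_AFTER_DESTROY));
	assert(!model_teardown(PROG_PUT_BEFORE_DESTROY));
}
```

Calling model_check() from a small main() exercises both orders. In the model, dropping the program reference before destroy never observes a released tp_event, which is the effect the proposed change is meant to have. Whether the real kernel behaves this way still needs to be confirmed with the reproducer.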
I triggered the problem with my 'perf bpf' patch series, and have
reproduced it once.
The bpf program is attached.
To reproduce it, I start recording with
# perf bpf record --object /root/sample_bpf_program.o -- sleep 4
and then press C-c after about 3 seconds, before sleep finishes.
The second call trace is identical to the previous one.
My environment is qemu running a v4.1-rc3 kernel.
Thank you.
-------------------------------------------------
#include <uapi/linux/bpf.h>
#include <linux/version.h>
#include <uapi/linux/ptrace.h>

#define SEC(NAME) __attribute__((section(NAME), used))

/* BPF helpers are called through pointers whose values are the helper IDs */
static int (*bpf_map_delete_elem)(void *map, void *key) =
	(void *) BPF_FUNC_map_delete_elem;
static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
	(void *) BPF_FUNC_trace_printk;

struct bpf_map_def {
	unsigned int type;
	unsigned int key_size;
	unsigned int value_size;
	unsigned int max_entries;
};

struct pair {
	u64 val;
	u64 ip;
};

struct bpf_map_def SEC("maps") my_map = {
	.type = BPF_MAP_TYPE_HASH,
	.key_size = sizeof(long),
	.value_size = sizeof(struct pair),
	.max_entries = 1000000,
};

struct bpf_map_def SEC("maps") my_map2 = {
	.type = BPF_MAP_TYPE_HASH,
	.key_size = sizeof(long),
	.value_size = sizeof(struct pair),
	.max_entries = 1000000,
};

/* Attached to the return of kmem_cache_free */
SEC("cache_free=kmem_cache_free%return")
int bpf_prog1(struct pt_regs *ctx)
{
	long ptr = ctx->r14;

	bpf_map_delete_elem(&my_map2, &ptr);
	return 0;
}

/* Attached to the entry of __alloc_pages_nodemask */
SEC("mybpfprog=__alloc_pages_nodemask")
int bpf_prog_my(struct pt_regs *ctx)
{
	char fmt[] = "Haha\n";
	long ptr = ctx->r14;

	bpf_trace_printk(fmt, sizeof(fmt));
	bpf_map_delete_elem(&my_map, &ptr);
	return 0;
}

char _license[] SEC("license") = "GPL";
u32 _version SEC("version") = LINUX_VERSION_CODE;