Message-ID: <75985227-01a9-3ca6-6a72-3f50dd3f1a45@fb.com>
Date: Thu, 9 Apr 2020 23:41:51 -0700
From: Yonghong Song <yhs@...com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
CC: Andrii Nakryiko <andriin@...com>, <bpf@...r.kernel.org>,
Martin KaFai Lau <kafai@...com>, <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>, <kernel-team@...com>
Subject: Re: [RFC PATCH bpf-next 15/16] tools/bpf: selftests: add dumper progs for bpf_map/task/task_file
On 4/9/20 8:33 PM, Alexei Starovoitov wrote:
> On Wed, Apr 08, 2020 at 04:25:38PM -0700, Yonghong Song wrote:
>> For task/file, the dumper prints out:
>> $ cat /sys/kernel/bpfdump/task/file/my1
>>     tgid      gid       fd file
>>        1        1        0 ffffffff95c97600
>>        1        1        1 ffffffff95c97600
>>        1        1        2 ffffffff95c97600
>> ....
>>     1895     1895      255 ffffffff95c8fe00
>>     1932     1932        0 ffffffff95c8fe00
>>     1932     1932        1 ffffffff95c8fe00
>>     1932     1932        2 ffffffff95c8fe00
>>     1932     1932        3 ffffffff95c185c0
> ...
>> +SEC("dump//sys/kernel/bpfdump/task/file")
>> +int BPF_PROG(dump_tasks, struct task_struct *task, __u32 fd, struct file *file,
>> +             struct seq_file *seq, u64 seq_num)
>> +{
>> +        static char const banner[] = "    tgid      gid       fd file\n";
>> +        static char const fmt1[] = "%8d %8d";
>> +        static char const fmt2[] = " %8d %lx\n";
>> +
>> +        if (seq_num == 0)
>> +                bpf_seq_printf(seq, banner, sizeof(banner));
>> +
>> +        bpf_seq_printf(seq, fmt1, sizeof(fmt1), task->tgid, task->pid);
>> +        bpf_seq_printf(seq, fmt2, sizeof(fmt2), fd, (long)file->f_op);
>> +        return 0;
>> +}
>
> I wonder what the speed of walking all files in all tasks with an empty
> program is? If it's fast I can imagine a million use cases for such a
> searching bpf prog, like finding which task owns a particular socket.
> This could be a massive feature.
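Yes. Below is a rough, untested sketch of what such a search could look
like with this interface, filtering on the socket's inode number. The
prog name, target_ino knob, and includes are made up for illustration;
userspace would set target_ino before loading:

/* Untested sketch: find which task owns a particular socket by
 * filtering the task/file dump on the socket's inode number.
 * target_ino is a hypothetical knob set by userspace before load.
 */
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

const volatile __u64 target_ino = 0;

SEC("dump//sys/kernel/bpfdump/task/file")
int BPF_PROG(find_sock_owner, struct task_struct *task, __u32 fd,
             struct file *file, struct seq_file *seq, u64 seq_num)
{
        static const char fmt[] = "%8d %8d\n";

        /* skip every file except the one backing the target socket */
        if (file->f_inode->i_ino != target_ino)
                return 0;

        /* print only the owning tgid and the fd it holds */
        bpf_seq_printf(seq, fmt, sizeof(fmt), task->tgid, fd);
        return 0;
}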
>
> With the one redundant spin_lock removed it seems it will be one spin_lock
> per prog invocation? Maybe eventually it can be amortized within the
> seq_file iterating logic. It would be really awesome if the cost were just
> refcnt ++/-- per call and rcu_read_lock.
The main seq_read() loop is below:

        while (1) {
                size_t offs = m->count;
                loff_t pos = m->index;

                p = m->op->next(m, p, &m->index);
                if (pos == m->index)
                        /* Buggy ->next function */
                        m->index++;
                if (!p || IS_ERR(p)) {
                        err = PTR_ERR(p);
                        break;
                }
                if (m->count >= size)
                        break;
                err = m->op->show(m, p);
                if (seq_has_overflowed(m) || err) {
                        m->count = offs;
                        if (likely(err <= 0))
                                break;
                }
        }
If we remove the spin_lock() as suggested in the other email comment,
we won't have a spin_lock() in the seq_ops->next() function, only
refcnt ++/-- and rcu_read_{lock,unlock}. The seq_ops->show() does
not take any spin_lock() either.
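For illustration, the shape I have in mind for the lock-free next() is
roughly the below. The iterator state and helper names are made up, not
the actual patch code; the point is only that advancing the iterator
costs rcu_read_lock() plus one task refcount per invocation:

/* Rough sketch only; task_file_iter and task_file_advance() are
 * hypothetical. Advancing takes rcu_read_lock() plus a task refcount
 * instead of any spin_lock.
 */
static void *task_file_seq_next(struct seq_file *m, void *v, loff_t *pos)
{
        struct task_file_iter *iter = m->private;  /* hypothetical state */
        struct task_struct *next;

        rcu_read_lock();
        next = task_file_advance(iter);            /* hypothetical helper */
        if (next)
                get_task_struct(next);             /* refcnt++ per call */
        rcu_read_unlock();

        if (iter->task)
                put_task_struct(iter->task);       /* refcnt-- on previous */
        iter->task = next;

        ++*pos;
        return next;
}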
I have not had time to do a perf measurement yet; will do in the
next revision.
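The measurement itself should be straightforward, e.g. timing a full
read of the dump file from userspace, along these lines (sketch; the
path assumes the "my1" dumper from above):

/* Userspace sketch: time one full read of the dumper output. */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
        char buf[4096];
        struct timespec t0, t1;
        int fd;

        fd = open("/sys/kernel/bpfdump/task/file/my1", O_RDONLY);
        if (fd < 0)
                return 1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        while (read(fd, buf, sizeof(buf)) > 0)
                ;                                  /* drain the dump */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        close(fd);

        printf("full read took %.3f ms\n",
               (t1.tv_sec - t0.tv_sec) * 1e3 +
               (t1.tv_nsec - t0.tv_nsec) / 1e6);
        return 0;
}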