[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <716adfa5-bd5d-3fe2-108c-ff24b2e81420@bytedance.com>
Date: Thu, 28 Sep 2023 11:29:51 +0800
From: Chuyi Zhou <zhouchuyi@...edance.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: bpf@...r.kernel.org, ast@...nel.org, daniel@...earbox.net,
andrii@...nel.org, martin.lau@...nel.org, tj@...nel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next v3 3/7] bpf: Introduce task open coded iterator
kfuncs
Hello,
在 2023/9/28 07:20, Andrii Nakryiko 写道:
> On Mon, Sep 25, 2023 at 3:56 AM Chuyi Zhou <zhouchuyi@...edance.com> wrote:
>>
>> This patch adds kfuncs bpf_iter_task_{new,next,destroy} which allow
>> creation and manipulation of struct bpf_iter_task in open-coded iterator
>> style. BPF programs can use these kfuncs or through bpf_for_each macro to
>> iterate all processes in the system.
>>
>> The API design keep consistent with SEC("iter/task"). bpf_iter_task_new()
>> accepts a specific task and iterating type which allows:
>> 1. iterating all process in the system
>>
>> 2. iterating all threads in the system
>>
>> 3. iterating all threads of a specific task
>> Here we also resuse enum bpf_iter_task_type and rename BPF_TASK_ITER_TID
>> to BPF_TASK_ITER_THREAD, rename BPF_TASK_ITER_TGID to BPF_TASK_ITER_PROC.
>>
>> The newly-added struct bpf_iter_task has a name collision with a selftest
>> for the seq_file task iter's bpf skel, so the selftests/bpf/progs file is
>> renamed in order to avoid the collision.
>>
>> Signed-off-by: Chuyi Zhou <zhouchuyi@...edance.com>
>> ---
>> include/linux/bpf.h | 8 +-
>> kernel/bpf/helpers.c | 3 +
>> kernel/bpf/task_iter.c | 96 ++++++++++++++++---
>> .../testing/selftests/bpf/bpf_experimental.h | 5 +
>> .../selftests/bpf/prog_tests/bpf_iter.c | 18 ++--
>> .../{bpf_iter_task.c => bpf_iter_tasks.c} | 0
>> 6 files changed, 106 insertions(+), 24 deletions(-)
>> rename tools/testing/selftests/bpf/progs/{bpf_iter_task.c => bpf_iter_tasks.c} (100%)
>>
>
> [...]
>
>> @@ -692,9 +692,9 @@ static int bpf_iter_fill_link_info(const struct bpf_iter_aux_info *aux, struct b
>> static void bpf_iter_task_show_fdinfo(const struct bpf_iter_aux_info *aux, struct seq_file *seq)
>> {
>> seq_printf(seq, "task_type:\t%s\n", iter_task_type_names[aux->task.type]);
>> - if (aux->task.type == BPF_TASK_ITER_TID)
>> + if (aux->task.type == BPF_TASK_ITER_THREAD)
>> seq_printf(seq, "tid:\t%u\n", aux->task.pid);
>> - else if (aux->task.type == BPF_TASK_ITER_TGID)
>> + else if (aux->task.type == BPF_TASK_ITER_PROC)
>> seq_printf(seq, "pid:\t%u\n", aux->task.pid);
>> }
>>
>> @@ -856,6 +856,80 @@ __bpf_kfunc void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it)
>> bpf_mem_free(&bpf_global_ma, kit->css_it);
>> }
>>
>> +struct bpf_iter_task {
>> + __u64 __opaque[2];
>> + __u32 __opaque_int[1];
>
> this should be __u64 __opaque[3], because struct takes full 24 bytes
>
>> +} __attribute__((aligned(8)));
>> +
>> +struct bpf_iter_task_kern {
>> + struct task_struct *task;
>> + struct task_struct *pos;
>> + unsigned int type;
>> +} __attribute__((aligned(8)));
>> +
>> +__bpf_kfunc int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struct *task, unsigned int type)
>
> nit: type -> flags, so we can add a bit more stuff, if necessary
>
>> +{
>> + struct bpf_iter_task_kern *kit = (void *)it;
>
> empty line after variable declarations
>
>> + BUILD_BUG_ON(sizeof(struct bpf_iter_task_kern) != sizeof(struct bpf_iter_task));
>> + BUILD_BUG_ON(__alignof__(struct bpf_iter_task_kern) !=
>> + __alignof__(struct bpf_iter_task));
>
> and I'd add empty line here to keep BUILD_BUG_ON block separate
>
>> + kit->task = kit->pos = NULL;
>> + switch (type) {
>> + case BPF_TASK_ITER_ALL:
>> + case BPF_TASK_ITER_PROC:
>> + case BPF_TASK_ITER_THREAD:
>> + break;
>> + default:
>> + return -EINVAL;
>> + }
>> +
>> + if (type == BPF_TASK_ITER_THREAD)
>> + kit->task = task;
>> + else
>> + kit->task = &init_task;
>> + kit->pos = kit->task;
>> + kit->type = type;
>> + return 0;
>> +}
>> +
>> +__bpf_kfunc struct task_struct *bpf_iter_task_next(struct bpf_iter_task *it)
>> +{
>> + struct bpf_iter_task_kern *kit = (void *)it;
>> + struct task_struct *pos;
>> + unsigned int type;
>> +
>> + type = kit->type;
>> + pos = kit->pos;
>> +
>> + if (!pos)
>> + goto out;
>> +
>> + if (type == BPF_TASK_ITER_PROC)
>> + goto get_next_task;
>> +
>> + kit->pos = next_thread(kit->pos);
>> + if (kit->pos == kit->task) {
>> + if (type == BPF_TASK_ITER_THREAD) {
>> + kit->pos = NULL;
>> + goto out;
>> + }
>> + } else
>> + goto out;
>> +
>> +get_next_task:
>> + kit->pos = next_task(kit->pos);
>> + kit->task = kit->pos;
>> + if (kit->pos == &init_task)
>> + kit->pos = NULL;
>
> I can't say I completely follow the logic (e.g., for
> BPF_TASK_ITER_PROC, why do we do next_task() on first next() call)?
> Can you elabore the expected behavior for various combinations of
> types and starting task argument?
>
Thanks for the review.
The expected behavior of current implementation is:
BPF_TASK_ITER_PROC:
init_task->first_process->second_process->...->last_process->init_task
We would exit before visiting init_task again.
BPF_TASK_ITER_THREAD:
group_task->first_thread->second_thread->...->last_thread->group_task
We would exit before visiting group_task again.
BPF_TASK_ITER_ALL:
init_task -> first_process -> second_process -> ...
| |
-> first_thread.. |
-> first_thread
Actually, every next() call, we would return the "pos" which was
prepared by previous next() call, and use next_task()/next_thread() to
update kit->pos. Once we meet the exit condition (next_task() return
init_task or next_thread() return group_task), we would update kit->pos
to NULL. In this way, when next() is called again, we will terminate the
iteration.
Here "kit->pos = NULL;" means we would return the last valid "pos" and
will return NULL in next call to exit from the iteration.
Am I miss something important?
Thanks.
Powered by blists - more mailing lists