[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzZij0uXAvvkNJQNOF=fu44Bg+SsxX7x3WWg2+5wsOATrQ@mail.gmail.com>
Date: Fri, 1 May 2020 12:01:08 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Yonghong Song <yhs@...com>
Cc: Andrii Nakryiko <andriin@...com>, bpf <bpf@...r.kernel.org>,
Martin KaFai Lau <kafai@...com>,
Networking <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next v1 11/19] bpf: add task and task/file targets
On Fri, May 1, 2020 at 10:23 AM Yonghong Song <yhs@...com> wrote:
>
>
>
> On 4/29/20 7:08 PM, Andrii Nakryiko wrote:
> > On Mon, Apr 27, 2020 at 1:17 PM Yonghong Song <yhs@...com> wrote:
> >>
> >> Only the tasks belonging to "current" pid namespace
> >> are enumerated.
> >>
> >> For task/file target, the bpf program will have access to
> >> struct task_struct *task
> >> u32 fd
> >> struct file *file
> >> where fd/file is an open file for the task.
> >>
> >> Signed-off-by: Yonghong Song <yhs@...com>
> >> ---
> >> kernel/bpf/Makefile | 2 +-
> >> kernel/bpf/task_iter.c | 319 +++++++++++++++++++++++++++++++++++++++++
> >> 2 files changed, 320 insertions(+), 1 deletion(-)
> >> create mode 100644 kernel/bpf/task_iter.c
> >>
> >
> > [...]
> >
> >> +static void *task_seq_start(struct seq_file *seq, loff_t *pos)
> >> +{
> >> + struct bpf_iter_seq_task_info *info = seq->private;
> >> + struct task_struct *task;
> >> + u32 id = info->id;
> >> +
> >> + if (*pos == 0)
> >> + info->ns = task_active_pid_ns(current);
> >
> > I wonder why pid namespace is set in start() callback each time, while
> > net_ns was set once when seq_file is created. I think it should be
> > consistent, no? Either pid_ns is another feature and is set
> > consistently just once using the context of the process that creates
> > seq_file, or net_ns could be set using the same method without
> > bpf_iter infra knowing about this feature? Or there are some
> > non-obvious aspects which make pid_ns easier to work with?
> >
> > Either way, process read()'ing seq_file might be different than
> > process open()'ing seq_file, so they might have different namespaces.
> > We need to decide explicitly which context should be used and do it
> > consistently.
>
> Good point. for networking case, the `net` namespace is locked
> at seq_file open stage and later on it is used for seq_read().
>
> I think I should do the same thing, locking down pid namespace
> at open.
Yeah, I think it's a good idea.
>
> >
> >> +
> >> + task = task_seq_get_next(info->ns, &id);
> >> + if (!task)
> >> + return NULL;
> >> +
> >> + ++*pos;
> >> + info->task = task;
> >> + info->id = id;
> >> +
> >> + return task;
> >> +}
> >> +
> >> +static void *task_seq_next(struct seq_file *seq, void *v, loff_t *pos)
> >> +{
> >> + struct bpf_iter_seq_task_info *info = seq->private;
> >> + struct task_struct *task;
> >> +
> >> + ++*pos;
> >> + ++info->id;
> >
> > this would make iterator skip pid 0? Is that by design?
>
> The start will try to find pid 0. That means start will never
> return SEQ_START_TOKEN since the bpf program won't be called any way.
Never mind, I confused task_seq_next() and task_seq_get_next() :)
>
> >
> >> + task = task_seq_get_next(info->ns, &info->id);
> >> + if (!task)
> >> + return NULL;
> >> +
> >> + put_task_struct(info->task);
> >
> > on very first iteration info->task might be NULL, right?
>
> Even the first iteration info->task is not NULL. The start()
> will forcefully try to find the first real task from idr number 0.
>
Right, goes to same confusion as above, sorry.
> >
> >> + info->task = task;
> >> + return task;
> >> +}
> >> +
> >> +struct bpf_iter__task {
> >> + __bpf_md_ptr(struct bpf_iter_meta *, meta);
> >> + __bpf_md_ptr(struct task_struct *, task);
> >> +};
> >> +
> >> +int __init __bpf_iter__task(struct bpf_iter_meta *meta, struct task_struct *task)
> >> +{
> >> + return 0;
> >> +}
> >> +
> >> +static int task_seq_show(struct seq_file *seq, void *v)
> >> +{
> >> + struct bpf_iter_meta meta;
> >> + struct bpf_iter__task ctx;
> >> + struct bpf_prog *prog;
> >> + int ret = 0;
> >> +
> >> + prog = bpf_iter_get_prog(seq, sizeof(struct bpf_iter_seq_task_info),
> >> + &meta.session_id, &meta.seq_num,
> >> + v == (void *)0);
> >> + if (prog) {
> >
> > can it happen that prog is NULL?
>
> Yes, this function is shared between show() and stop().
> The stop() function might be called multiple times since
> user can repeatedly try read() although there is nothing
> there, in which case, the seq_ops will be just
> start() and stop().
Ah, right, NULL case after end of iteration, got it.
>
> >
> >
> >> + meta.seq = seq;
> >> + ctx.meta = &meta;
> >> + ctx.task = v;
> >> + ret = bpf_iter_run_prog(prog, &ctx);
> >> + }
> >> +
> >> + return ret == 0 ? 0 : -EINVAL;
> >> +}
> >> +
> >> +static void task_seq_stop(struct seq_file *seq, void *v)
> >> +{
> >> + struct bpf_iter_seq_task_info *info = seq->private;
> >> +
> >> + if (!v)
> >> + task_seq_show(seq, v);
> >
> > hmm... show() called from stop()? what's the case where this is necessary?
>
> I will refactor it better. This is to invoke bpf program
> in stop() with NULL object to signal the end of
> iteration.
>
> >> +
> >> + if (info->task) {
> >> + put_task_struct(info->task);
> >> + info->task = NULL;
> >> + }
> >> +}
> >> +
> >
> > [...]
> >
Powered by blists - more mailing lists