Message-ID: <20210309153504.0b06ded1@gandalf.local.home>
Date: Tue, 9 Mar 2021 15:35:04 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: David Ahern <dsahern@...il.com>
Cc: Tony Lu <tonylu@...ux.alibaba.com>, davem@...emloft.net,
mingo@...hat.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] net: add net namespace inode for all net_dev events

On Tue, 9 Mar 2021 13:17:23 -0700
David Ahern <dsahern@...il.com> wrote:
> On 3/9/21 1:02 PM, Steven Rostedt wrote:
> > On Tue, 9 Mar 2021 12:53:37 -0700
> > David Ahern <dsahern@...il.com> wrote:
> >
> >> Changing the order of the fields will impact any bpf programs expecting
> >> the existing format
> >
> > I thought bpf programs were not API. And why are they not parsing this
> > information? They have these offsets hard coded???? Why would they do that!
> > The information to extract the data wherever it is has been there from
> > day 1! Way before BPF ever had access to trace events.
>
> BPF programs attached to a tracepoint are passed a context - a structure
> based on the format for the tracepoint. To take an in-tree example, look
> at samples/bpf/offwaketime_kern.c:
>
> ...
>
> /* taken from /sys/kernel/debug/tracing/events/sched/sched_switch/format */
> struct sched_switch_args {
> 	unsigned long long pad;
> 	char prev_comm[16];
> 	int prev_pid;
> 	int prev_prio;
> 	long long prev_state;
> 	char next_comm[16];
> 	int next_pid;
> 	int next_prio;
> };
> SEC("tracepoint/sched/sched_switch")
> int oncpu(struct sched_switch_args *ctx)
> {
>
> ...
>
> Production systems do not typically have toolchains installed, so
> dynamic generation of the program based on the 'format' file on the
> running system is not realistic. That means creating the programs on a
> development machine and installing on the production box. Further, there
> is an expectation that a bpf program compiled against version X works on
> version Y. Changing the order of the fields will break such programs in
> non-obvious ways.

The size and order of the fields change all the time in various events. I
recommend doing so *all the time*. If you upgrade a kernel, then all the bpf
programs you have for that kernel should also be updated. You can't rely on
the fields keeping the same size or order. The best you can do is expect a
field to continue to exist, and even that is not a guarantee.

I'm not sure how that sample is used. I can't find "oncpu()" anywhere in
that directory besides where it is defined, and I wouldn't think a bpf
program would just blindly map the fields without verifying them.
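
A minimal sketch of the kind of check I mean, assuming the usual
"field:<decl>; offset:<n>; size:<n>; signed:<n>;" lines of an event's
format file. The names parse_format_line(), find_field() and struct
field_info are made up for illustration; they are not an existing kernel
or libbpf API. A loader could look up each field it cares about and
refuse to attach if the offset or size differs from what the program was
built with:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: where one event field lives in the record. */
struct field_info {
	char name[64];
	unsigned int offset;
	unsigned int size;
};

/*
 * Parse one line of an event "format" file, assumed to look like:
 *   field:int prev_pid;  offset:24;  size:4;  signed:1;
 * Returns 1 and fills *out on success, 0 if the line is not a field line.
 */
int parse_format_line(const char *line, struct field_info *out)
{
	const char *decl, *semi, *end, *name, *off, *sz;
	size_t len;

	assert(line && out);

	decl = strstr(line, "field:");
	if (!decl)
		return 0;
	decl += strlen("field:");

	semi = strchr(decl, ';');
	if (!semi)
		return 0;

	/* Drop a trailing array suffix such as "[16]" from the declaration. */
	end = semi;
	if (end > decl && end[-1] == ']') {
		while (end > decl && *end != '[')
			end--;
	}

	/* The field name is the last identifier before 'end'. */
	name = end;
	while (name > decl && name[-1] != ' ' && name[-1] != '\t' &&
	       name[-1] != '*')
		name--;

	len = (size_t)(end - name);
	if (len == 0 || len >= sizeof(out->name))
		return 0;
	memcpy(out->name, name, len);
	out->name[len] = '\0';

	off = strstr(semi, "offset:");
	sz = strstr(semi, "size:");
	if (!off || !sz)
		return 0;
	if (sscanf(off, "offset:%u", &out->offset) != 1)
		return 0;
	if (sscanf(sz, "size:%u", &out->size) != 1)
		return 0;

	return 1;
}

/* Scan an already opened format file for the field called 'want'. */
int find_field(FILE *fp, const char *want, struct field_info *out)
{
	char line[512];

	while (fgets(line, sizeof(line), fp)) {
		if (parse_format_line(line, out) &&
		    strcmp(out->name, want) == 0)
			return 1;
	}
	return 0;
}
```

With something like this, the offsets come from the running kernel
rather than from a struct that was copied out of a format file on some
other machine.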

-- Steve