Message-ID: <2eb4b72dc5578407715e91f87116d2385598fa82.camel@fejes.dev>
Date: Mon, 28 Apr 2025 12:20:06 +0200
From: Ferenc Fejes <ferenc@...es.dev>
To: Ido Schimmel <idosch@...dia.com>, dsahern@...il.com
Cc: netdev <netdev@...r.kernel.org>, kuniyu@...zon.com
Subject: Re: [question] robust netns association with fib4 lookup
On Fri, 2025-04-25 at 21:17 +0300, Ido Schimmel wrote:
> On Thu, Apr 24, 2025 at 01:33:08PM +0200, Ferenc Fejes wrote:
> > Hi,
> >
> > tl;dr: I want to trace fib4 lookups within a network namespace with eBPF.
> > This works well with fib6, as the struct net ptr is passed as an argument
> > to fib6_table_lookup [0], so I can read the inode from it and pass it to
> > userspace.
> >
> >
> > Additional context: I'm working on a fib table and fib rule lookup tracer
> > application that hooks fib_table_lookup/fib6_table_lookup and
> > fib_rules_lookup with fexit eBPF probes and gathers useful data from the
> > struct flowi4 and flowi6 used for the lookup, as well as the resulting
> > nexthop (gw, seg6, mpls tunnel) if the lookup is successful. If this works,
> > my plan is to extend it to neighbour, fdb and mdb lookups.
> >
> > Tracepoints exist for fib lookups v4 [1] and v6 [2] but in my tracer I would
> > like to have netns filtering. For example: "check unsuccessful fib4 rule and
> > table lookups in netns foo". Unfortunately I can't find a reliable way to
> > associate netns info with fib4 lookups. The main problems are as follows.
> >
> > Unlike fib6_table_lookup for v6, fib_table_lookup for v4 does not have a
> > struct net argument. This makes sense, as struct net is not needed there.
> > But without it, the netns association is not as easy as in the v6 case.
> >
> > On the other hand, fib_lookup [3], which in most cases calls
> > fib_table_lookup, has a struct net parameter. Even better, there is the
> > struct fib_result ptr returned by fib_table_lookup. This would be the
> > perfect candidate to hook into, but unfortunately it is an inline function.
> >
> > If there are custom fib rules in the netns, __fib_lookup [4] is called,
> > which is hookable. This has all the necessary info like netns, table and
> > result. To use this I have to add the custom rule to the traced netns and
> > remove it immediately. This will enforce the __fib_lookup codepath. But I
> > feel that at some point this bug(?) will be fixed and the kernel will notice
> > the absence of custom rules and switch back to the original codepath.
> >
> > But this option is useless for tracing unsuccessful lookups. The stack looks
> > like this:
> > __fib_lookup <-- netns info available
> > fib_rules_lookup <-- losing netns info... :-(
> > fib4_rule_action <-- unsuccessful result available
> > fib_table_lookup <-- source of unsuccessful result
> >
> > My current workaround is to restore the netns info using the struct flowi4
> > pointer. When we have the stack above, I use an eBPF hashmap with the
> > flowi4 pointer as the key and the netns as the value. Then in
> > fib_table_lookup I look up the netns id based on the value of the flowi4
> > pointer. Since this is the common case, it works, but it looks like
> > fib_table_lookup is called from other places as well (even if that is rare).
> >
> > Is there any other way to get the netns info for fib4 lookups? If not,
> > would it be worth an RFC to pass the struct net argument to
> > fib_table_lookup as well, as is currently done in fib6_table_lookup?
>
> I think it makes sense to make both tracepoints similar and pass the net
> argument to trace_fib_table_lookup().

Thank you for looking into it.

>
> > Unfortunately this includes some callers to fib_table_lookup. The
> > netns id would also be presented in the existing tracepoints ([1] and
> > [2]). Thanks in advance for any suggestion.
>
> By "netns id" you mean the netns cookie? It seems that some TCP trace
> events already expose it (see include/trace/events/tcp.h). It would be
> nice to finally have "perf" filter these FIB events based on netns.

No, by netns id I mean struct net::ns::inum, the inode number associated with
the netns. This is convenient because the value is easy to look up in
userspace with the lsns tool, or by running stat on the netns file in procfs.
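
To illustrate the userspace side of that lookup, here is a minimal sketch in C.
It assumes a Linux system with procfs mounted; netns_inum() is a hypothetical
helper name used only for this example, not something from the kernel or from
my tracer.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper (illustration only): store the inode number of the
 * namespace file at `path`, e.g. "/proc/self/ns/net", in *inum.
 * Returns 0 on success and -1 if stat() fails.
 */
static int netns_inum(const char *path, unsigned long long *inum)
{
    struct stat st;

    assert(path != NULL && inum != NULL);

    /* stat() follows the procfs symlink, so st_ino is the namespace inode. */
    if (stat(path, &st) != 0)
        return -1;

    *inum = (unsigned long long)st.st_ino;
    return 0;
}
```

Calling netns_inum("/proc/self/ns/net", &inum) should give the same number that
lsns reports for the caller's net namespace, which is the value I would match
against the inum read in the probe.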

It looks like struct net::net_cookie serves a similar purpose and can also be
used from restricted contexts (e.g. XDP/tc/cls eBPF programs), where rich
context such as the struct net available in an fexit/fentry probe is not
accessible.

>
> David, any objections?
>
> >
> > Best,
> > Ferenc
> >
> >
> > [0] https://elixir.bootlin.com/linux/v6.15-rc3/source/net/ipv6/route.c#L2221
> > [1]
> > https://elixir.bootlin.com/linux/v6.15-rc3/source/include/trace/events/fib.h
> > [2]
> > https://elixir.bootlin.com/linux/v6.14/source/include/trace/events/fib6.h
> > [3]
> > https://elixir.bootlin.com/linux/v6.15-rc3/source/include/net/ip_fib.h#L374