[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZeBGy0Wt2rmR0j74@bzorp2>
Date: Thu, 29 Feb 2024 09:56:43 +0100
From: Balazs Scheidler <bazsi77@...il.com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: balazs.scheidler@...flow.com, netdev@...r.kernel.org
Subject: Re: [PATCH net-next 2/2] net: udp: add IP/port data to the
tracepoint udp/udp_fail_queue_rcv_skb
On Thu, Feb 29, 2024 at 12:27:11AM -0800, Kuniyuki Iwashima wrote:
> From: Balazs Scheidler <bazsi77@...il.com>
> Date: Thu, 29 Feb 2024 08:38:00 +0100
> > The udp_fail_queue_rcv_skb() tracepoint lacks any details on the source
> > and destination IP/port whereas this information can be critical in case
> > of UDP/syslog.
> >
> > Signed-off-by: Balazs Scheidler <balazs.scheidler@...flow.com>
> > ---
> > include/trace/events/udp.h | 33 +++++++++++++++++++++++++++++----
> > net/ipv4/udp.c | 2 +-
> > net/ipv6/udp.c | 3 ++-
> > 3 files changed, 32 insertions(+), 6 deletions(-)
> >
> > diff --git a/include/trace/events/udp.h b/include/trace/events/udp.h
> > index 336fe272889f..cd4ae5c2fad7 100644
> > --- a/include/trace/events/udp.h
> > +++ b/include/trace/events/udp.h
> > @@ -7,24 +7,49 @@
> >
> > #include <linux/udp.h>
> > #include <linux/tracepoint.h>
> > +#include <trace/events/net_probe_common.h>
> >
> > TRACE_EVENT(udp_fail_queue_rcv_skb,
> >
> > - TP_PROTO(int rc, struct sock *sk),
> > + TP_PROTO(int rc, struct sock *sk, struct sk_buff *skb),
> >
> > - TP_ARGS(rc, sk),
> > + TP_ARGS(rc, sk, skb),
> >
> > TP_STRUCT__entry(
> > __field(int, rc)
> > __field(__u16, lport)
> > +
> > + __field(__u16, sport)
> > + __field(__u16, dport)
>
> duplicating lport just for reusing TP_STORE_ADDR_PORTS_SKB() ?
> Then, I think we should define udp-specific macro.
I left "lport" in for compatibility with the previous users of the same
tracepoint.
The sk->inet_num can be different from the dport in the packet, for instance when
TPROXY style redirects are done. In that case the TPROXY rule would look up
a socket and queue the incoming packet there, even if their port numbers
differ.
If I was adding this tracepoint now, I would probably skip the "lport"
value, I agree this is redundant. But there are other issues as well:
* I think that the name "lport" is confusing, all other tracepoints have saddr/daddr sport/dport.
* Sometimes address information is stored stored as separate fields (e.g.
family, saddr/daddr, sport/dport), in other cases they are stored as "struct sockaddr"
that bundles family, address/port information.
With that said, I could either go ahead and:
1) break tracepoint compatibility by removing lport completely
2) change the interpretation of lport and store the dport in that field
(still incompatible, but would not break stuff), this would leave the
confusing lport name.
3) retain lport and add the redundant fields (compatible at a cost of an
extra u16 field)
I've chosen number 3) above, as it fully retains compatibility and sometimes
the extra lport field can come handy in case someone is using TPROXY with
UDP.
>
>
> > + __field(__u16, family)
> > + __array(__u8, saddr, sizeof(struct sockaddr_in6))
> > + __array(__u8, daddr, sizeof(struct sockaddr_in6))
> > ),
> >
> > TP_fast_assign(
> > + const struct inet_sock *inet = inet_sk(sk);
> > + const struct udphdr *uh = (const struct udphdr *)udp_hdr(skb);
> > + __be32 *p32;
> > +
> > __entry->rc = rc;
> > - __entry->lport = inet_sk(sk)->inet_num;
> > + __entry->lport = inet->inet_num;
> > +
> > + __entry->sport = ntohs(uh->source);
> > + __entry->dport = ntohs(uh->dest);
> > + __entry->family = sk->sk_family;
> > +
> > + p32 = (__be32 *) __entry->saddr;
> > + *p32 = inet->inet_saddr;
> > +
> > + p32 = (__be32 *) __entry->daddr;
> > + *p32 = inet->inet_daddr;
>
> nit: double space here.
Thanks, I fixed this.
>
>
> > +
> > + TP_STORE_ADDR_PORTS_SKB(__entry, skb, uh);
> > ),
> >
> > - TP_printk("rc=%d port=%hu", __entry->rc, __entry->lport)
> > + TP_printk("rc=%d port=%hu family=%s src=%pISpc dest=%pISpc", __entry->rc, __entry->lport,
> > + show_family_name(__entry->family),
> > + __entry->saddr, __entry->daddr)
> > );
> >
> > #endif /* _TRACE_UDP_H */
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index a8acea17b4e5..d21a85257367 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -2051,8 +2051,8 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> > drop_reason = SKB_DROP_REASON_PROTO_MEM;
> > }
> > UDP_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
> > + trace_udp_fail_queue_rcv_skb(rc, sk, skb);
> > kfree_skb_reason(skb, drop_reason);
> > - trace_udp_fail_queue_rcv_skb(rc, sk);
> > return -1;
> > }
> >
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index 3f2249b4cd5f..e5a52c4c934c 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -34,6 +34,7 @@
> > #include <linux/slab.h>
> > #include <linux/uaccess.h>
> > #include <linux/indirect_call_wrapper.h>
> > +#include <trace/events/udp.h>
> >
> > #include <net/addrconf.h>
> > #include <net/ndisc.h>
> > @@ -661,8 +662,8 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> > drop_reason = SKB_DROP_REASON_PROTO_MEM;
> > }
> > UDP6_INC_STATS(sock_net(sk), UDP_MIB_INERRORS, is_udplite);
> > + trace_udp_fail_queue_rcv_skb(rc, sk, skb);
> > kfree_skb_reason(skb, drop_reason);
> > - trace_udp_fail_queue_rcv_skb(rc, sk);
> > return -1;
> > }
> >
> > --
> > 2.40.1
Powered by blists - more mailing lists