Message-ID: <20171012231839.upwbtco3h524mmui@ast-mbp>
Date: Thu, 12 Oct 2017 16:18:40 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
Brendan Gregg <brendan.d.gregg@...il.com>,
Neal Cardwell <ncardwell@...gle.com>
Subject: Re: [Patch net-next v2] tcp: add a tracepoint for tcp_retransmit_skb()

On Thu, Oct 12, 2017 at 03:48:07PM -0700, Cong Wang wrote:
> We need a real-time notification for tcp retransmission
> for monitoring.
>
> Of course we could use ftrace to dynamically instrument this
> kernel function too, however we can't retrieve the connection
> information at the same time, for example perf-tools [1] reads
> /proc/net/tcp for socket details, which is slow when we have
> a lot of connections.
>
> Therefore, this patch adds a tracepoint for tcp_retransmit_skb()
> and exposes src/dst IP addresses and ports of the connection.
> This also makes it easier to integrate into perf.
>
> Note, I expose both IPv4 and IPv6 addresses at the same time:
> for an IPv4 socket, the v4-mapped address is used as the IPv6 address,
> for an IPv6 socket, LOOPBACK4_IPV6 is already filled by kernel.
> Also, add sk and skb pointers as they are useful for BPF.
>
> 1. https://github.com/brendangregg/perf-tools/blob/master/net/tcpretrans
>
> Cc: Eric Dumazet <edumazet@...gle.com>
> Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>
> Cc: Hannes Frederic Sowa <hannes@...essinduktion.org>
> Cc: Brendan Gregg <brendan.d.gregg@...il.com>
> Cc: Neal Cardwell <ncardwell@...gle.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@...il.com>
> ---
> include/trace/events/tcp.h | 68 ++++++++++++++++++++++++++++++++++++++++++++++
> net/core/net-traces.c | 1 +
> net/ipv4/tcp_output.c | 3 ++
> 3 files changed, 72 insertions(+)
> create mode 100644 include/trace/events/tcp.h
>
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> new file mode 100644
> index 000000000000..749f93c542ab
> --- /dev/null
> +++ b/include/trace/events/tcp.h
> @@ -0,0 +1,68 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM tcp
> +
> +#if !defined(_TRACE_TCP_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_TCP_H
> +
> +#include <linux/ipv6.h>
> +#include <linux/tcp.h>
> +#include <linux/tracepoint.h>
> +#include <net/ipv6.h>
> +
> +TRACE_EVENT(tcp_retransmit_skb,
> +
> + TP_PROTO(struct sock *sk, struct sk_buff *skb, int segs),
> +
> + TP_ARGS(sk, skb, segs),
> +
> + TP_STRUCT__entry(
> + __field(void *, skbaddr)
> + __field(void *, skaddr)
> + __field(__u16, sport)
> + __field(__u16, dport)
> + __array(__u8, saddr, 4)
> + __array(__u8, daddr, 4)
> + __array(__u8, saddr_v6, 16)
> + __array(__u8, daddr_v6, 16)
> + ),
...
> if (likely(!err)) {
> TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;
> + trace_tcp_retransmit_skb(sk, skb, segs);
Looks great to me, but why is 'segs' there?
It's unused.
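
For reference, the v4-mapped layout the commit message mentions (an IPv4 address carried in the saddr_v6/daddr_v6 fields) can be sketched in user space roughly as below. This is only an illustration under my own assumptions: v4_to_v4mapped() is a hypothetical helper name, not something from the patch, and the kernel has its own helpers for this.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical user-space helper (not part of the patch): build the
 * 16-byte IPv4-mapped IPv6 address ::ffff:a.b.c.d from a 4-byte IPv4
 * address given in network byte order.
 */
static void v4_to_v4mapped(const uint8_t v4[4], uint8_t v6[16])
{
	assert(v4 != NULL && v6 != NULL);

	memset(v6, 0, 10);        /* bytes 0..9 are zero */
	v6[10] = 0xff;            /* bytes 10..11 hold 0xffff */
	v6[11] = 0xff;
	memcpy(v6 + 12, v4, 4);   /* bytes 12..15 carry the IPv4 address */
}
```

With this layout a consumer of the tracepoint could always read the 16-byte fields and treat an ::ffff:0:0/96 prefix as "this was an IPv4 socket".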