Message-ID: <be0f5d0d-479e-46a3-9e9c-ebcd0b1987e9@linux.dev>
Date: Wed, 22 Jan 2025 10:40:20 -0800
From: Yonghong Song <yonghong.song@...ux.dev>
To: Breno Leitao <leitao@...ian.org>, Steven Rostedt <rostedt@...dmis.org>
Cc: Jason Xing <kerneljasonxing@...il.com>, Eric Dumazet
<edumazet@...gle.com>, Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
"David S. Miller" <davem@...emloft.net>, David Ahern <dsahern@...nel.org>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
kernel-team@...a.com, Song Liu <song@...nel.org>,
Martin KaFai Lau <martin.lau@...nel.org>
Subject: Re: [PATCH RFC net-next] trace: tcp: Add tracepoint for
tcp_cwnd_reduction()
On 1/22/25 1:39 AM, Breno Leitao wrote:
> Hello Steven,
>
> On Mon, Jan 20, 2025 at 10:03:40AM -0500, Steven Rostedt wrote:
>> On Mon, 20 Jan 2025 05:20:05 -0800
>> Breno Leitao <leitao@...ian.org> wrote:
>>
>>> This patch enhances the API's stability by introducing a guaranteed hook
>>> point, allowing the compiler to make changes without disrupting the
>>> BPF program's functionality.
>> Instead of using a TRACE_EVENT() macro, you can use DECLARE_TRACE()
>> which will create the tracepoint in the kernel, but will not create a
>> trace event that is exported to the tracefs file system. Then BPF could
>> hook to it and it will still not be exposed as a user space API.
> Right, DECLARE_TRACE would solve my current problem, but, a056a5bed7fa
> ("sched/debug: Export the newly added tracepoints") says "BPF doesn't
> have infrastructure to access these bare tracepoints either.".
>
> Does BPF know how to attach to these bare tracepoints now?
>
> On the other hand, it seems real tracepoints are getting more pervasive?
> So, the current approach might be OK as well?
>
> https://lore.kernel.org/bpf/20250118033723.GV1977892@ZenIV/T/#m4c2fb2d904e839b34800daf8578dff0b9abd69a0
>
>> You can see its use in include/trace/events/sched.h
> I suppose I need to export the tracepoint with
> EXPORT_TRACEPOINT_SYMBOL_GPL(), right?
>
> I am trying to hack something like the following, but I struggled to hook
> BPF into it.
>
> Thank you!
> --breno
>
> Author: Breno Leitao <leitao@...ian.org>
> Date: Fri Jan 17 09:26:22 2025 -0800
>
> trace: tcp: Add tracepoint for tcp_cwnd_reduction()
>
> Add a lightweight tracepoint to monitor TCP congestion window
> adjustments via tcp_cwnd_reduction(). This tracepoint enables tracking
> of:
> - TCP window size fluctuations
> - Active socket behavior
> - Congestion window reduction events
>
> Meta has been using BPF programs to monitor this function for years.
> Adding a proper tracepoint provides a stable API for all users who need
> to monitor TCP congestion window behavior.
>
> Use DECLARE_TRACE instead of TRACE_EVENT to avoid creating trace event
> infrastructure and exporting to tracefs, keeping the implementation
> minimal. (Thanks Steven Rostedt)
>
> Signed-off-by: Breno Leitao <leitao@...ian.org>
>
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> index a27c4b619dffd..07add3e20931a 100644
> --- a/include/trace/events/tcp.h
> +++ b/include/trace/events/tcp.h
> @@ -259,6 +259,11 @@ TRACE_EVENT(tcp_retransmit_synack,
> __entry->saddr_v6, __entry->daddr_v6)
> );
>
> +DECLARE_TRACE(tcp_cwnd_reduction_tp,
> + TP_PROTO(const struct sock *sk, const int newly_acked_sacked,
> + const int newly_lost, const int flag),
I don't think we need 'const' for the int arguments, since they are passed
by value. For 'const struct sock *' it makes sense, since we do not want
sk-><fields> to be changed.
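To illustrate the point about 'const', here is a small user-space sketch. It is not kernel code: struct sock_model, probe_const_int(), probe_plain_int() and check_probes() are hypothetical names made up for this example. It only shows that a top-level 'const' on a by-value int is invisible to the caller, while 'const' on the pointee is what prevents writes through sk.

```c
#include <assert.h>

/* Hypothetical stand-in for struct sock; NOT the kernel definition. */
struct sock_model {
	int cwnd;
};

/* 'const int' only restricts the callee's private copy of the argument. */
int probe_const_int(const struct sock_model *sk, const int newly_lost)
{
	/* sk->cwnd = 0;  would not compile: the pointee is const */
	return sk->cwnd + newly_lost;
}

/* Same signature from the caller's point of view, without 'const' on the int. */
int probe_plain_int(const struct sock_model *sk, int newly_lost)
{
	newly_lost += 1;	/* changes the callee's copy only */
	return sk->cwnd + newly_lost - 1;
}

/*
 * Both functions are compatible with this pointer type, because top-level
 * qualifiers on parameters are ignored when C compares function types.
 */
typedef int (*probe_fn)(const struct sock_model *, int);

/* Returns 1 if both probes agree and the caller's data is untouched. */
int check_probes(void)
{
	struct sock_model sk = { .cwnd = 10 };
	int lost = 3;
	probe_fn a = probe_const_int;
	probe_fn b = probe_plain_int;
	int ra = a(&sk, lost);
	int rb = b(&sk, lost);

	return ra == 13 && rb == 13 && lost == 3 && sk.cwnd == 10;
}
```

In other words, dropping 'const' from the three int arguments in TP_PROTO() should not change what a probe can or cannot do to the caller's values.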
> + TP_ARGS(sk, newly_acked_sacked, newly_lost, flag));
> +
> #include <trace/events/net_probe_common.h>
>
> TRACE_EVENT(tcp_probe,
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 4811727b8a022..74cf8dbbedaa0 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2710,6 +2710,8 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost,
> if (newly_acked_sacked <= 0 || WARN_ON_ONCE(!tp->prior_cwnd))
> return;
>
> + trace_tcp_cwnd_reduction_tp(sk, newly_acked_sacked, newly_lost, flag);
> +
> tp->prr_delivered += newly_acked_sacked;
> if (delta < 0) {
> u64 dividend = (u64)tp->snd_ssthresh * tp->prr_delivered +
> @@ -2726,6 +2728,7 @@ void tcp_cwnd_reduction(struct sock *sk, int newly_acked_sacked, int newly_lost,
> sndcnt = max(sndcnt, (tp->prr_out ? 0 : 1));
> tcp_snd_cwnd_set(tp, tcp_packets_in_flight(tp) + sndcnt);
> }
> +EXPORT_TRACEPOINT_SYMBOL_GPL(tcp_cwnd_reduction_tp);
>
> static inline void tcp_end_cwnd_reduction(struct sock *sk)
> {
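As a rough illustration of where the hook in the tcp_input.c hunk above fires, here is a simplified user-space model. All names (struct tcp_model, register_cwnd_probe(), cwnd_reduction_model(), and so on) are hypothetical, and a single function pointer stands in for the real tracepoint machinery, which this sketch does not attempt to reproduce. It only mirrors the control flow of the patch: the probe runs after the early-return check and before prr_delivered is updated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct tcp_sock. */
struct tcp_model {
	unsigned int prior_cwnd;
	int prr_delivered;
};

typedef void (*cwnd_probe_fn)(const struct tcp_model *tp,
			      int newly_acked_sacked, int newly_lost, int flag);

/* NULL means no probe is attached (models a disabled tracepoint). */
static cwnd_probe_fn cwnd_probe;

void register_cwnd_probe(cwnd_probe_fn fn)
{
	cwnd_probe = fn;
}

/* Models trace_tcp_cwnd_reduction_tp(): call the probe only if attached. */
static void trace_cwnd_reduction_model(const struct tcp_model *tp,
				       int newly_acked_sacked, int newly_lost,
				       int flag)
{
	if (cwnd_probe)
		cwnd_probe(tp, newly_acked_sacked, newly_lost, flag);
}

/* Mirrors the control flow around the hook in the patched function. */
void cwnd_reduction_model(struct tcp_model *tp, int newly_acked_sacked,
			  int newly_lost, int flag)
{
	if (newly_acked_sacked <= 0 || !tp->prior_cwnd)
		return;		/* the hook is never reached on this path */

	trace_cwnd_reduction_model(tp, newly_acked_sacked, newly_lost, flag);

	tp->prr_delivered += newly_acked_sacked;
}

static int probe_hits;

/* Example probe: just counts how often the hook fired. */
void count_probe(const struct tcp_model *tp, int newly_acked_sacked,
		 int newly_lost, int flag)
{
	(void)tp; (void)newly_acked_sacked; (void)newly_lost; (void)flag;
	probe_hits++;
}

/* Drives three calls; only the second one passes the early-return check. */
int run_model(void)
{
	struct tcp_model tp = { .prior_cwnd = 10, .prr_delivered = 0 };

	probe_hits = 0;
	register_cwnd_probe(count_probe);
	cwnd_reduction_model(&tp, 0, 0, 0);	/* newly_acked_sacked <= 0 */
	cwnd_reduction_model(&tp, 2, 1, 0);	/* reaches the hook */
	tp.prior_cwnd = 0;
	cwnd_reduction_model(&tp, 2, 1, 0);	/* prior_cwnd == 0 */
	return probe_hits;
}
```

With the hook placed this way, a probe would only see calls that actually proceed to adjust the congestion window, and it would observe the socket state before prr_delivered changes.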