lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z+5zNB6FO51CwlMW@gmail.com>
Date: Thu, 3 Apr 2025 04:38:28 -0700
From: Breno Leitao <leitao@...ian.org>
To: David Ahern <dsahern@...nel.org>
Cc: Eric Dumazet <edumazet@...gle.com>,
	Neal Cardwell <ncardwell@...gle.com>,
	Kuniyuki Iwashima <kuniyu@...zon.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	"David S. Miller" <davem@...emloft.net>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
	kernel-team@...a.com, yonghong.song@...ux.dev
Subject: Re: [PATCH net-next] trace: tcp: Add tracepoint for tcp_sendmsg()

On Wed, Apr 02, 2025 at 08:11:03AM -0600, David Ahern wrote:
> On 4/2/25 8:57 AM, Breno Leitao wrote:
> > diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> > index 1a40c41ff8c30..cd90a8c66d683 100644
> > --- a/include/trace/events/tcp.h
> > +++ b/include/trace/events/tcp.h
> > @@ -259,6 +259,29 @@ TRACE_EVENT(tcp_retransmit_synack,
> >  		  __entry->saddr_v6, __entry->daddr_v6)
> >  );
> >  
> > +TRACE_EVENT(tcp_sendmsg_locked,
> > +	TP_PROTO(struct msghdr *msg, struct sk_buff *skb, int size_goal),
> 
> How about passing in the sk reference here; not needed for trace
> entries, but makes it directly accessible for bpf programs.

Right, so, without the trace entries, the output of
/sys/kernel/debug/tracing/events/tcp/tcp_sendmsg_locked/format
doesn't show it, right?

	field:__u64 skb;	offset:8;	size:8;	signed:0;
	field:int skb_len;	offset:16;	size:4;	signed:1;
	field:int msg_left;	offset:20;	size:4;	signed:1;
	field:int size_goal;	offset:24;	size:4;	signed:1;


This is what I've been testing now:

	trace: tcp: Add tracepoint for tcp_sendmsg_locked()
	
	Add a tracepoint to monitor TCP sendmsg operations, enabling the tracing
	of TCP messages being sent.
	
	Meta has been using BPF programs to monitor tcp_sendmsg() for years,
	indicating significant interest in observing this important
	functionality. Adding a proper tracepoint provides a stable API for all
	users who need visibility into TCP message transmission.
	
	David Ahern is using a similar functionality with a custom patch[1]. So,
	this means we have more than a single use case for this request.
	
	The implementation adopts David's approach[1] for greater flexibility
	compared to the initial proposal.
	
	Link: https://lore.kernel.org/all/70168c8f-bf52-4279-b4c4-be64527aa1ac@kernel.org/ [1]
	Signed-off-by: Breno Leitao <leitao@...ian.org>

	diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
	index 1a40c41ff8c30..21529d5faf3b1 100644
	--- a/include/trace/events/tcp.h
	+++ b/include/trace/events/tcp.h
	@@ -259,6 +259,30 @@ TRACE_EVENT(tcp_retransmit_synack,
			__entry->saddr_v6, __entry->daddr_v6)
	);
	
	+TRACE_EVENT(tcp_sendmsg_locked,
	+	TP_PROTO(const struct sock *sk, const struct msghdr *msg,
	+		 const struct sk_buff *skb, int size_goal),
	+
	+	TP_ARGS(sk, msg, skb, size_goal),
	+
	+	TP_STRUCT__entry(
	+		__field(__u64, skb)
	+		__field(int, skb_len)
	+		__field(int, msg_left)
	+		__field(int, size_goal)
	+	),
	+
	+	TP_fast_assign(
	+		__entry->skb = (__u64)skb;
	+		__entry->skb_len = skb ? skb->len : 0;
	+		__entry->msg_left = msg_data_left(msg);
	+		__entry->size_goal = size_goal;
	+	),
	+
	+	TP_printk("skb %llx skb_len %d msg_left %d size_goal %d", __entry->skb,
	+		__entry->skb_len, __entry->msg_left, __entry->size_goal)
	+);
	+
	DECLARE_TRACE(tcp_cwnd_reduction_tp,
		TP_PROTO(const struct sock *sk, int newly_acked_sacked,
			int newly_lost, int flag),
	diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
	index ea8de00f669d0..270ce2c8c2d54 100644
	--- a/net/ipv4/tcp.c
	+++ b/net/ipv4/tcp.c
	@@ -1160,6 +1160,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
			if (skb)
				copy = size_goal - skb->len;
	
	+		trace_tcp_sendmsg_locked(sk, msg, skb, size_goal);
	+
			if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) {
				bool first_skb;


Plus this small dependency:

	net: pass const to msg_data_left()

	The msg_data_left() function doesn't modify the struct msghdr parameter,
	so mark it as const. This allows the function to be used with const
	references, improving type safety and making the API more flexible.

	Signed-off-by: Breno Leitao <leitao@...ian.org>

	diff --git a/include/linux/socket.h b/include/linux/socket.h
	index c3322eb3d6865..3b262487ec060 100644
	--- a/include/linux/socket.h
	+++ b/include/linux/socket.h
	@@ -168,7 +168,7 @@ static inline struct cmsghdr * cmsg_nxthdr (struct msghdr *__msg, struct cmsghdr
		return __cmsg_nxthdr(__msg->msg_control, __msg->msg_controllen, __cmsg);
	}

	-static inline size_t msg_data_left(struct msghdr *msg)
	+static inline size_t msg_data_left(const struct msghdr *msg)
	{
		return iov_iter_count(&msg->msg_iter);
	}

Thanks for your help,
Breno

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ