lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Wed, 29 May 2024 11:31:04 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: edumazet@...gle.com,
	dsahern@...nel.org,
	kuba@...nel.org,
	pabeni@...hat.com,
	davem@...emloft.net
Cc: netdev@...r.kernel.org,
	kerneljasonxing@...il.com,
	Jason Xing <kernelxing@...cent.com>,
	Yongming Liu <yomiliu@...cent.com>,
	Wangzi Yong <curuwang@...cent.com>
Subject: [PATCH v2 net-next] tcp: introduce a new MIB for CLOSE-WAIT sockets

From: Jason Xing <kernelxing@...cent.com>

CLOSE-WAIT is a relatively special state which "represents waiting for
a connection termination request from the local user" (RFC 793). Some
issues may happen because of unexpected/too many CLOSE-WAIT sockets,
like user application mistakenly handling close() syscall. It's a very
common issue in the real world.

We want to trace this total number of CLOSE-WAIT sockets fastly and
frequently instead of resorting to displaying them altogether by using:

  ss -s state close-wait

or something like this. They need to loop and collect required socket
information in kernel and then get back to the userside for print, which
does harm to the performance especially in heavy load for frequent
sampling.

That's the reason why I chose to introduce this new MIB counter like
CurrEstab does. With this counter implemented, we can record/observe the
normal changes of this counter all the time. It can help us:
1) We are able to be alerted in advance if the counter changes drastically.
2) If some users report some issues happening, we will particularly
pay more attention to it.

Besides, in the group of TCP_MIB_* defined by RFC 1213, TCP_MIB_CURRESTAB
should include both ESTABLISHED and CLOSE-WAIT sockets in theory:

  "tcpCurrEstab OBJECT-TYPE
   ...
   The number of TCP connections for which the current state
   is either ESTABLISHED or CLOSE- WAIT."

However, the thing is we don't do that according to RFC. The reason is
unknown. At least since 2005, we should have counted CLOSE-WAIT sockets
I think, there is a need to finish the work as RFC says. It's definitely
not a bad thing.

After this patch, we can see the counter by running 'cat /proc/net/netstat'
or 'nstat -s | grep CloseWait'

Suggested-by: Yongming Liu <yomiliu@...cent.com>
Suggested-by: Wangzi Yong <curuwang@...cent.com>
Signed-off-by: Jason Xing <kernelxing@...cent.com>
---
v2
Link: https://lore.kernel.org/all/20240528021149.6186-1-kerneljasonxing@gmail.com/
1. revise the commit message to let other developers know what the use of
such a new counter.
2. introduce a decrement-counter help so that this new counter can do the
same thing as CurrEstab does. It's also the same as what I implemented locally.

BTW, I just finish implementing the correct snmp based on the RFC. Is it really so
hard to count close-wait sockets? I wondered... Any suggestions are welcome.

Thanks in advance.
---
 include/net/ip.h          | 1 +
 include/uapi/linux/snmp.h | 1 +
 net/ipv4/proc.c           | 1 +
 net/ipv4/tcp.c            | 5 +++++
 4 files changed, 8 insertions(+)

diff --git a/include/net/ip.h b/include/net/ip.h
index 6d735e00d3f3..0fe2994796b0 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -298,6 +298,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 #define IP_UPD_PO_STATS(net, field, val) SNMP_UPD_PO_STATS64((net)->mib.ip_statistics, field, val)
 #define __IP_UPD_PO_STATS(net, field, val) __SNMP_UPD_PO_STATS64((net)->mib.ip_statistics, field, val)
 #define NET_INC_STATS(net, field)	SNMP_INC_STATS((net)->mib.net_statistics, field)
+#define NET_DEC_STATS(net, field)	SNMP_DEC_STATS((net)->mib.net_statistics, field)
 #define __NET_INC_STATS(net, field)	__SNMP_INC_STATS((net)->mib.net_statistics, field)
 #define NET_ADD_STATS(net, field, adnd)	SNMP_ADD_STATS((net)->mib.net_statistics, field, adnd)
 #define __NET_ADD_STATS(net, field, adnd) __SNMP_ADD_STATS((net)->mib.net_statistics, field, adnd)
diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index adf5fd78dd50..c0feefb4d88b 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -302,6 +302,7 @@ enum
 	LINUX_MIB_TCPAOKEYNOTFOUND,		/* TCPAOKeyNotFound */
 	LINUX_MIB_TCPAOGOOD,			/* TCPAOGood */
 	LINUX_MIB_TCPAODROPPEDICMPS,		/* TCPAODroppedIcmps */
+	LINUX_MIB_TCPCLOSEWAIT,			/* TCPCloseWait */
 	__LINUX_MIB_MAX
 };
 
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 6c4664c681ca..964897dc6eb8 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -305,6 +305,7 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TCPAOKeyNotFound", LINUX_MIB_TCPAOKEYNOTFOUND),
 	SNMP_MIB_ITEM("TCPAOGood", LINUX_MIB_TCPAOGOOD),
 	SNMP_MIB_ITEM("TCPAODroppedIcmps", LINUX_MIB_TCPAODROPPEDICMPS),
+	SNMP_MIB_ITEM("TCPCloseWait", LINUX_MIB_TCPCLOSEWAIT),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 681b54e1f3a6..3908ea7fd14a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2661,6 +2661,11 @@ void tcp_set_state(struct sock *sk, int state)
 			TCP_DEC_STATS(sock_net(sk), TCP_MIB_CURRESTAB);
 	}
 
+	if (state == TCP_CLOSE_WAIT)
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPCLOSEWAIT);
+	if (oldstate == TCP_CLOSE_WAIT)
+		NET_DEC_STATS(sock_net(sk), LINUX_MIB_TCPCLOSEWAIT);
+
 	/* Change state AFTER socket is unhashed to avoid closed
 	 * socket sitting in hash tables.
 	 */
-- 
2.37.3


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ