Message-ID: <CABAhCORpd+1A6uThBQ_YYx16iLkPZDXs5vwTkYDNAxcN3epWDw@mail.gmail.com>
Date: Wed, 11 Dec 2024 14:35:16 +0800
From: Xiao Liang <shaw.leon@...il.com>
To: Jay Vosburgh <jv@...sburgh.net>
Cc: Cong Wang <xiyou.wangcong@...il.com>, dave seddon <dave.seddon.ca@...il.com>,
netdev@...r.kernel.org, Kuniyuki Iwashima <kuniyu@...zon.com>
Subject: Re: tcp_diag for all network namespaces?
On Wed, Dec 11, 2024 at 1:43 PM Jay Vosburgh <jv@...sburgh.net> wrote:
>
> Cong Wang <xiyou.wangcong@...il.com> wrote:
>
> >On Mon, Dec 09, 2024 at 11:24:18AM -0800, dave seddon wrote:
> >> G'day,
> >>
> >> Short
> >> Is there a way to extract tcp_diag socket data for all sockets from
> >> all network name spaces please?
> >>
> >> Background
> >> I've been using tcp_diag to dump out TCP socket performance every
> >> minute and then stream the data via Kafka and then into a Clickhouse
> >> database. This is awesome for socket performance monitoring.
> >>
> >> Kubernetes
> >> I'd like to adapt this solution to <somehow> allow monitoring of
> >> kubernetes clusters, so that it would be possible to monitor the
> >> socket performance of all pods. Ideally, a single process could open
> >> a netlink socket into each network namespace, but currently that isn't
> >> possible.
> >>
> >> Would it be crazy to add a new feature to the kernel to allow dumping
> >> all sockets from all name spaces?
> >
> >You are already able to do so in user-space, something like:
> >
> >for ns in $(ip netns list | cut -d' ' -f1); do
> > ip netns exec $ns ss -tapn
> >done
> >
> >(If you use API, you can find equivalent API's)
>
> FWIW, if any namespaces weren't created through /sbin/ip, then
> something like the following works as well:
>
> #!/bin/bash
>
> nspidlist=`lsns -t net -o pid -n`
>
> for p in ${nspidlist}; do
>     lsns -p ${p} -t net
>     nsenter -n -t ${p} ss -tapn
> done

I don't think either iproute2 or lsns can actually list all net
namespaces. iproute2 relies on the bind mounts under /run/netns by
default, and lsns iterates through processes. But there are other ways
to hold a reference to a netns: open fds, sockets, and files hidden in
other mnt namespaces...
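
To illustrate the gap, here is a rough sketch that gathers netns inode
numbers from three userspace-visible sources: the /run/netns mounts,
/proc/PID/ns/net, and open fds under /proc/PID/fd. The function names
are made up for this example, and it assumes GNU stat and awk. It is
still incomplete by design: a netns held only by a socket, or by a
file in another mnt namespace, will not show up.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of iproute2 or util-linux.

# Extract the inode number from an nsfs link target such as
# "net:[4026531840]". Prints nothing for any other string.
netns_inum_from_link() {
    case "$1" in
        'net:['*']')
            inum=${1#'net:['}
            inum=${inum%']'}
            case "$inum" in
                ''|*[!0-9]*) ;;                  # not purely digits: ignore
                *) printf '%s\n' "$inum" ;;
            esac
            ;;
    esac
}

# Print "inum source" pairs for every netns reference visible from here.
list_netns_refs() {
    # 1. Named namespaces: iproute2 bind-mounts them under /run/netns.
    for f in /run/netns/*; do
        [ -e "$f" ] || continue
        ino=$(stat -L -c %i "$f" 2>/dev/null) || continue
        printf '%s %s\n' "$ino" "$f"
    done
    # 2. Namespaces that processes live in (the source lsns walks), and
    # 3. namespaces kept open through file descriptors.
    for l in /proc/[0-9]*/ns/net /proc/[0-9]*/fd/*; do
        inum=$(netns_inum_from_link "$(readlink "$l" 2>/dev/null)")
        if [ -n "$inum" ]; then
            printf '%s %s\n' "$inum" "$l"
        fi
    done
}

# Reduce "inum source" pairs on stdin to the distinct inode numbers,
# keeping first-seen order.
unique_inums() {
    awk '!seen[$1]++ { print $1 }'
}
```

Running "list_netns_refs | unique_inums" as root (needed to read other
processes' fd directories) and comparing the result with the output of
"lsns -t net" should show any namespaces that are held only through
open fds.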

Consider the case where we move an interface into a netns, and some
process creates a socket in that ns and then switches back to the init
ns. If we then delete the ns with "ip netns delete", the socket keeps
it alive, but the interface and the ns are no longer reachable from
userspace. That is hard to troubleshoot.

I haven't found a way to enumerate net namespaces reliably. Maybe we
could have an API that lists the namespaces in net_namespace_list and
allows processes to open an ns file by inum?

>
> -J
>
> ---
> -Jay Vosburgh, jv@...sburgh.net
>