[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89i+zbXNOJtxJjMDVKEFt2LnjSW9xGG71bMBRc_YimuqKLA@mail.gmail.com>
Date: Tue, 28 May 2024 09:36:32 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: dsahern@...nel.org, kuba@...nel.org, pabeni@...hat.com,
davem@...emloft.net, netdev@...r.kernel.org,
Jason Xing <kernelxing@...cent.com>, Yongming Liu <yomiliu@...cent.com>,
Wangzi Yong <curuwang@...cent.com>
Subject: Re: [PATCH net-next] tcp: introduce a new MIB for CLOSE-WAIT sockets
On Tue, May 28, 2024 at 8:48 AM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> Hello Eric,
>
> On Tue, May 28, 2024 at 1:13 PM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Tue, May 28, 2024 at 4:12 AM Jason Xing <kerneljasonxing@...il.com> wrote:
> > >
> > > From: Jason Xing <kernelxing@...cent.com>
> > >
> > > CLOSE-WAIT is a relatively special state which "represents waiting for
> > > a connection termination request from the local user" (RFC 793). Some
> > > issues may happen because of unexpected/too many CLOSE-WAIT sockets,
> > > like user application mistakenly handling close() syscall.
> > >
> > > We want to trace this total number of CLOSE-WAIT sockets fastly and
> > > frequently instead of resorting to displaying them altogether by using:
> > >
> > > netstat -anlp | grep CLOSE_WAIT
> >
> > This is horribly expensive.
>
> Yes.
>
> > Why asking af_unix and program names ?
> > You want to count some TCP sockets in a given state, right ?
> > iproute2 interface (inet_diag) can do the filtering in the kernel,
> > saving a lot of cycles.
> >
> > ss -t state close-wait
>
> Indeed, it is much better than netstat but not that good/fast enough
> if we've already generated a lot of sockets. This command is suitable
> for debug use, but not for frequent sampling, say, every 10 seconds.
> More than this, RFC 1213 defines CurrEstab which should also include
> close-wait sockets, but we don't have this one.
"we don't have this one."
You mean we do not have CurrEstab ?
That might be user space decision to not display it from nstat
command, in useless_number()
(Not sure why. If someone thought it was useless, then CLOSE_WAIT
count is even more useless...)
> I have no intention to
> change the CurrEstab in Linux because it has been used for a really
> long time. So I chose to introduce a new counter in linux mib
> definitions.
>
> >
> > >
> > > or something like this, which does harm to the performance especially in
> > > heavy load. That's the reason why I chose to introduce this new MIB counter
> > > like CurrEstab does. It do help us diagnose/find issues in production.
> > >
> > > Besides, in the group of TCP_MIB_* defined by RFC 1213, TCP_MIB_CURRESTAB
> > > should include both ESTABLISHED and CLOSE-WAIT sockets in theory:
> > >
> > > "tcpCurrEstab OBJECT-TYPE
> > > ...
> > > The number of TCP connections for which the current state
> > > is either ESTABLISHED or CLOSE- WAIT."
> > >
> > > Apparently, at least since 2005, we don't count CLOSE-WAIT sockets. I think
> > > there is a need to count it separately to avoid polluting the existing
> > > TCP_MIB_CURRESTAB counter.
> > >
> > > After this patch, we can see the counter by running 'cat /proc/net/netstat'
> > > or 'nstat -s | grep CloseWait'
> >
> > I find this counter quite not interesting.
> > After a few days of uptime, let say it is 52904523
> > What can you make of this value exactly ?
> > How do you make any correlation ?
>
> There are two ways of implementing this counter:
> 1) like the counters in 'linux mib definitions', we have to 'diff' the
> counter then we can know how many close-wait sockets generated in a
> certain period.
And what do you make of this raw information ?
if it is 10000 or 20000 in a 10-second period, what conclusion do you get ?
Receiving FIN packets is a fact of life, I see no reason to worry.
> 2) like what CurrEstab does, then we have to introduce a new helper
> (for example, NET_DEC_STATS) to decrement the counter if the state of
> the close-wait socket changes in tcp_set_state().
>
> After thinking more about your question, the latter is better because
> it can easily reflect the current situation, right? What do you think?
I think you are sending not tested patches.
I suggest you use your patches first, and send tests so that we can all see
the intent, how this is supposed to work in the first place.
That would save us time.
Thank you.
Powered by blists - more mailing lists