Message-ID: <201201230830.08993.hans.schillstrom@ericsson.com>
Date: Mon, 23 Jan 2012 08:30:08 +0100
From: Hans Schillstrom <hans.schillstrom@...csson.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: Hagen Paul Pfeifer <hagen@...u.net>,
David Miller <davem@...emloft.net>,
"equinox@...c24.net" <equinox@...c24.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: RFC Hanging clean-up of a namespace
On Monday 23 January 2012 08:17:00 Eric W. Biederman wrote:
> Hans Schillstrom <hans.schillstrom@...csson.com> writes:
>
> > On Monday 23 January 2012 07:25:52 Eric W. Biederman wrote:
> >> Hans Schillstrom <hans.schillstrom@...csson.com> writes:
> >>
> >> > On Friday 20 January 2012 21:55:27 Eric W. Biederman wrote:
> >> >> My current hypothesis is that the namespace actually didn't get freed
> >> >> until the tcp socket finished closing. You can check by looking at when
> >> >> __put_net and then cleanup_net are called.
> >> >
> >> > __put_net() is called just after tcp_write_timer() fires and then
> >> > cleanup_net()
> >>
> >> Hypothesis confirmed. Your speed problem is that it is taking 2 minutes
> >> in the pathological case for your tcp socket to close.
> >>
> >> Do you have any clue why it is taking your sockets so long to close?
> >> Is the other side simply not responding?
> >>
> >
> > The root cause of death is that the other side (init_net namespace) dies first
> > and when it dies all containers will be killed ...
>
> ?????
>
> init_net can not die. init_net must not die. It makes no sense for the
> ref count on init_net to drop to 0. Among other places every kernel
> thread uses init_net. Furthermore the network namespaces are
> independent so even the impossible death of the initial network
> namespace should not kill a child network namespace.
>
> Did I misunderstand what you said?
Yes you did, sorry for my bad explanation.
>
> If you have a setup where you stop being able to talk to the outside
> world because you were relaying through the initial network namespace
> and the relay through the initial network namespace stopped functioning
> that makes sense. If effectively all of your packets are being dropped
> on the floor, the tcp retransmit behavior on closing a socket makes
> sense. I just don't get how you have triggered that state.
>
We have a process in the root namespace (init_net) that has
tcp connections into a number of containers (on the same machine).
If the control process in the root ns dies, all containers will also be killed,
i.e. there can easily be outstanding messages to and from the containers.
So I guess that's why I see tcp_write_timer() firing.
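
As a rough sanity check on the 2 minute figure, here is a back-of-the-envelope
sketch of how long a socket keeps retransmitting an unanswered segment under
exponential backoff. The initial RTO of 0.2 s, the 120 s cap and the 8 retries
are assumptions for illustration, not values measured on this system, and the
function name is made up.

```python
def total_retransmit_time(initial_rto=0.2, max_rto=120.0, retries=8):
    """Seconds from the first transmission until the sender gives up,
    assuming no retransmission is ever acknowledged.

    All default values are illustrative assumptions.
    """
    elapsed = 0.0
    rto = initial_rto
    # One wait before each retransmission, plus a final wait
    # after the last retransmission before giving up.
    for _ in range(retries + 1):
        elapsed += rto
        rto = min(rto * 2, max_rto)  # double the timeout, up to the cap
    return elapsed


if __name__ == "__main__":
    print(f"{total_retransmit_time():.1f} s")
```

With these assumed values the total is a bit over 100 s, which is in the same
range as the roughly 2 minutes before __put_net() and cleanup_net() run.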
--
Regards
Hans Schillstrom <hans.schillstrom@...csson.com>