Message-ID: <4AD4D66B.6090903@librato.com>
Date: Tue, 13 Oct 2009 15:35:07 -0400
From: Oren Laadan <orenl@...rato.com>
To: Dan Smith <danms@...ibm.com>
CC: containers@...ts.osdl.org, netdev@...r.kernel.org,
John Dykstra <jdykstra72@...il.com>
Subject: Re: [PATCH 2/2] [RFC] Add c/r support for connected INET sockets
Dan Smith wrote:
> OL> IIRC, the TCP stack takes the timestamp for each packet directly
> OL> from jiffies. So you need to teach TCP to add a per-container (or
> OL> you can make it per-socket) delta to that timestamp.
>
> After wondering what the heck you were talking about, I realized I
> assumed you were talking about TCP timeouts and not timestamps :)
>
> I assume you mean the following:
>
> 1. Put an absolute time stamp in the checkpoint stream
> 2. Calculate the delta between that and the current time on the
> restoring host
> 3. Use that value to offset timestamps from that point on.
>
> Correct?
Sort of. Right now we already record the absolute time-of-checkpoint:
ctx->ktime_begin. The restart-blocks are saved relative to it. I'd
suggest the same for all time-related data from the network - save it
in the checkpoint image as deltas relative to checkpoint time.
At restart, the restart-blocks are restored relative to restart-time,
using the saved delta. That would work for the TCP timestamps too.
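
A rough sketch of the delta scheme described above, in plain user-space C
so it can be compiled on its own. The helper names ckpt_time_to_delta()
and ckpt_delta_to_time() are hypothetical and do not come from the c/r
patch set; ckpt_begin stands in for ctx->ktime_begin, and restart_begin
for whatever clock value is sampled at restart. Treat it as an
illustration of the arithmetic only, not as the actual implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not from the c/r patch set).
 * All values are in the same, arbitrary time unit (e.g. nanoseconds).
 */

/* Checkpoint side: express a time value relative to checkpoint time.
 * The result is negative for values that lie before the checkpoint
 * (e.g. a "last received" stamp) and positive for future expiries.
 * Assumes two's complement conversion, as on all Linux targets. */
static int64_t ckpt_time_to_delta(uint64_t value, uint64_t ckpt_begin)
{
	return (int64_t)(value - ckpt_begin);
}

/* Restart side: rebase a saved delta onto the restart-time clock. */
static uint64_t ckpt_delta_to_time(int64_t delta, uint64_t restart_begin)
{
	return restart_begin + (uint64_t)delta;
}
```

With this, a timer that had 100 units left at checkpoint still has 100
units left after restart, regardless of what the clock reads on the
restoring host.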
>
> Since I brought it up, do you agree that the retransmit timers should
> be canonicalized to time-after checkpoint values? It occurs to me
> that right now I restore a jiffies value on the receiving host which
> is guaranteed to be incorrect :)
As for TCP timeouts - I don't think it matters much in the case of
live migration whether the timeout after restart fires at the saved
delta relative to the original checkpoint-time or to the new restart-time.
The difference is likely to be subsecond to a few seconds at most,
not important for most use-cases, I'd think.
(If we are concerned about a TCP hiccup due to a migration, there are
tricks to work around that; timeouts and retransmits are not the
best way to go, because once you get there you have already slowed
down TCP significantly).
Here, too, once we have time virtualization this can be revisited, to
allow the user to choose a policy for how the deltas are used.
>
> OL> So I'm thinking, for both, do (1) put a big fat comment in the
> OL> code saying that sanity-tests are needed, and what for, and (2)
> OL> send a separate mail to the networking people with these two
> OL> scenarios and request comments ?
>
> Yeah, although I would hope that they're seeing this conversation and
> would chime in (hence the cc:netdev). Hopefully I don't have to
> disguise a separate email as non-C/R related to get past their
> filters! :)
>
> OL> For example, now, if a user wants to send a TCP packet with
> OL> arbitrary protocol parameters, he needs to use raw IP sockets,
> OL> which require privilege. Would restarting a connection with the
> OL> desired parameters become a way to bypass that restriction ?
> OL> (e.g. assume the user restarts while using the host's network
> OL> namespace).
>
> Um, yeah? I don't see much way around that if we're going to trust
> any of what is in the checkpoint stream. Perhaps we say CAP_NET_ADMIN
> is required to restart a live TCP connection?
>
I don't see much way around that either. My point is to bring the issue
to everyone's attention, and see what others say about it.
CAP_NET_ADMIN is one option. Is CAP_NET_RAW another option?
Oren.