Message-ID: <4CE3F773.2010303@kernel.org>
Date: Wed, 17 Nov 2010 16:40:35 +0100
From: Tejun Heo <tj@...nel.org>
To: Dan Smith <danms@...ibm.com>
CC: Gene Cooperman <gene@....neu.edu>,
Oren Laadan <orenl@...columbia.edu>,
Kapil Arya <kapil@....neu.edu>,
ksummit-2010-discuss@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, hch@....de
Subject: Re: [Ksummit-2010-discuss] checkpoint-restart: naked patch
Hello,
On 11/17/2010 04:33 PM, Dan Smith wrote:
> TH> If it ever becomes a general enough problem (which I extremely
> TH> strongly doubt),
>
> Migration of a container? Yeah, it's one of the primary reasons for
> doing what we're doing :)
Well, then push for the feature. If the rationale is strong enough,
it'll get in.
> TH> we can think about allowing processes in a netns to change
> TH> sequence number but that would be a single setsockopt option
>
> Yeah, well there's more than that, of course, if you want to be able
> to checkpoint a socket in any state. Buffers, time-wait, etc.
I haven't really thought about it too deeply but for all other misc
states, you should be able to emulate it by talking to a netfilter
module. The reason I suggested a setsockopt option for changing the
sequence number is that it is the only performance-sensitive part, and
with it you should be able to resume live sockets without conntracking.
For cold paths, using netfilter module during resume should do, right?
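
As a rough sketch of what such a sequence-number option could look like
from user space: no such option existed when this message was written, so
the code below assumes the TCP_REPAIR, TCP_REPAIR_QUEUE and TCP_QUEUE_SEQ
socket options that later Linux kernels provide. The helper name
restore_send_seq and the CR_TCP_SEND_QUEUE constant are hypothetical, and
the calls are expected to need CAP_NET_ADMIN in the socket's network
namespace.

```c
#include <errno.h>
#include <stdint.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Fallbacks for older libc headers; values taken from Linux's <linux/tcp.h>. */
#ifndef TCP_REPAIR
#define TCP_REPAIR        19
#endif
#ifndef TCP_REPAIR_QUEUE
#define TCP_REPAIR_QUEUE  20
#endif
#ifndef TCP_QUEUE_SEQ
#define TCP_QUEUE_SEQ     21
#endif

/* Queue selector for TCP_REPAIR_QUEUE (the kernel's TCP_SEND_QUEUE value). */
#define CR_TCP_SEND_QUEUE 2

/*
 * Hypothetical helper: set the send-side sequence number of a not yet
 * connected TCP socket so that a restored connection continues from `seq`.
 * Returns 0 on success or a negative errno value on failure.
 * On success the socket is left in repair mode for the rest of the restore.
 */
int restore_send_seq(int fd, uint32_t seq)
{
    int on = 1, off = 0;
    int queue = CR_TCP_SEND_QUEUE;
    int err;

    /* Enter repair mode; fails with EPERM without CAP_NET_ADMIN. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR, &on, sizeof(on)) < 0)
        return -errno;

    /* Select the send queue, then set its sequence number. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_REPAIR_QUEUE,
                   &queue, sizeof(queue)) < 0 ||
        setsockopt(fd, IPPROTO_TCP, TCP_QUEUE_SEQ,
                   &seq, sizeof(seq)) < 0) {
        err = -errno;
        /* Best effort: leave repair mode again on failure. */
        setsockopt(fd, IPPROTO_TCP, TCP_REPAIR, &off, sizeof(off));
        return err;
    }

    return 0;
}
```

A restore tool would call restore_send_seq(fd, saved_seq) on a fresh
socket before re-establishing the connection. Everything else (buffers,
time-wait and so on) stays on the cold path and is not covered here.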
> TH> instead of the horror show of dumping in-kernel data structures in
> TH> binary blob.
>
> Well, as should be evident from a review of the code, we don't dump
> binary kernel data structures as a general rule. We canonicalize them
> into checkpoint headers on the way out and build the new data
> structures (or use existing kernel interfaces to do so) on the way in.
> You know, just like netlink does.
netlink interaction is defined by ABI.
> It has even been suggested that we do this with netlink instead, to
> mirror the other "horror show" tools that we all use on a daily basis.
> We're not opposed to this, but we do have some concerns about
> performance.
The horror show part is dumping internal data structures without due
scrutiny in a way which can only ever be useful for CR when most of
the same state is already exported via ABI-defined interfaces.
Thanks.
--
tejun