Message-ID: <4F3B3A14.7000305@parallels.com>
Date: Wed, 15 Feb 2012 08:52:36 +0400
From: Pavel Emelyanov <xemul@...allels.com>
To: Andrew Morton <akpm@...ux-foundation.org>
CC: Cyrill Gorcunov <gorcunov@...nvz.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
Stanislav Kinsbursky <skinsbursky@...allels.com>,
James Bottomley <jbottomley@...allels.com>
Subject: Re: [patch 0/4] Resending, c/r series v2

On 02/15/2012 02:51 AM, Andrew Morton wrote:
> On Mon, 13 Feb 2012 20:48:22 +0400
> Cyrill Gorcunov <gorcunov@...nvz.org> wrote:
>
>> Hi, this series hopefully in a good shape
>>
>> - sys_kcmp now depends on CONFIG_CHECKPOINT_RESTORE
>>
>> - the extension of /proc/pid/stat now done against
>> linux-next/master
>>
>> Please let me know if I've missed something.
>
> Thus far our (my) approach has been to trickle the c/r support code
> into mainline as it is developed. Under the assumption that the end
> result will be acceptable and useful kernel code.
>
> I'm afraid that I'm losing confidence in that approach. We have this
> patchset, we have Stanislav's "IPC: checkpoint/restore in userspace
> enhancements" (which apparently needs to get more complex to support
> LSM context c/r). I simply *don't know* what additional patchsets are
> expected. And from what you told me it sounds like networking support
> is at a very early stage and I fear for what the end result of that
> will look like.

I understand. But there was a consensus that nobody wanted the c/r stuff to
be "one big kernel subsystem"; it should rather be "a bunch of small APIs
for what is required". The initial C/R attempt took ~100 patches. Supporting
our user-space C/R implementation takes *only* ~10, and the feature sets of
the two are already comparable.

As far as networking is concerned -- we will not require any additional
patches to implement basic netns configuration migration (ip can show and
re-configure all we need about routing, interfaces, devices, etc., and
iptables-save/iptables-restore will handle 99.9% of the netfilter part). What
we currently need is the ability to inspect socket queues, but so far this
doesn't turn out to be a lot of code -- I have a 60-line patch for unix
sockets, and Tejun showed how to do the same for TCP in 130 lines of code.
UDP won't require anything; its queues can be silently dropped. The recent 50
patches with the *_diag stuff don't count, because they are not for C/R only:
the ss tool can benefit from 100% of the added functionality (this, btw,
shows that not every piece of code we add for C/R is for C/R *only*).

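To make the "no new kernel patches" point concrete, here is a minimal sketch
of the dump side of netns configuration migration. It only builds the argv
lists for existing read-only tools (ip and iptables-save) and executes
nothing. The function name, the dictionary keys and the idea that the
namespace is reachable by name through `ip netns exec` are assumptions made
for illustration, not something CRIU is stated to do.

```python
from typing import Dict, List, Optional


def netns_dump_commands(netns: Optional[str] = None) -> Dict[str, List[str]]:
    """Return the argv of each read-only command that captures network config.

    Nothing is executed here; a caller would run each argv and keep its
    output. `netns` is a hypothetical namespace name. When it is given,
    every command is wrapped in `ip netns exec <name>`.
    """
    commands = {
        "links": ["ip", "link", "show"],
        "addresses": ["ip", "addr", "show"],
        "routes": ["ip", "route", "show"],
        "netfilter": ["iptables-save"],
    }
    if netns is not None:
        prefix = ["ip", "netns", "exec", netns]
        commands = {name: prefix + argv for name, argv in commands.items()}
    return commands
```

On restore, the saved netfilter dump would be fed to iptables-restore and the
rest replayed with ip; that side is not sketched here.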
> So I don't feel that I can continue feeding these things into mainline
> until someone can convince me that we won't have a nasty mess (and/or
> an insufficiently useful feature) at the end of the project.

Isn't it enough that the CONFIG_CHECKPOINT_RESTORE option is turned off by
default?

> The traditional approach is to develop the feature out-of-tree until it
> is "finished". That's a lot more hackwork for you guys and it leads to
> a poorer feature - this approach inevitably has a lower level of review
> and inhibits code rework.

That's why we started sending patches early.

> An alternative is for me to buffer the patches in my tree until it is
> all sufficiently finished. That also is more work for your team, but
> it will produce better code, because of additional review and code
> rework resulting from that review.
>
> I don't know how many patches that would end up being (this is part of
> the problem!) nor how long they would be carried for.

Neither do I :(

> So. Please talk to me. How long is this all going to take, and what
> will the final result look like?

The Big Intermediate Result we're trying to achieve is -- take a basic
OpenVZ or LXC container based on e.g. a rhel6 template and make sure we can
checkpoint and restore it without breaking it.

The More-or-less Finished state of the project would be when it's able to
do all the stuff that OpenVZ's implementation can. The list of major
features which are still absent from CRIU and for which we will require
kernel support includes

* shared kernel objects (this thread)
* tcp connection
* pty stuff
* sysvipc
* iterative working set migration

The last one is the ability to find out which pages processes use and to
catch when they change data on them. I planned to discuss this at LSF, but we
can start earlier if you want.
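
As a rough illustration of what "iterative" means here, the sketch below
simulates the idea in user space: copy every page once while the task keeps
running, then re-copy only the pages written since the previous round until
the dirty set is small enough to stop. The page map, the
`run_and_collect_dirty` callback (a stand-in for whatever dirty tracking the
kernel would provide) and the threshold are all hypothetical; this is not
the kernel interface being proposed.

```python
from typing import Callable, Dict, List, Set, Tuple


def iterative_precopy(
    memory: Dict[int, bytes],
    run_and_collect_dirty: Callable[[int], Set[int]],
    max_rounds: int = 5,
    threshold: int = 2,
) -> Tuple[Dict[int, bytes], List[int]]:
    """Simulate iterative working-set migration over a page-number -> data map.

    run_and_collect_dirty(round_no) lets the simulated task run for one round
    and returns the page numbers it wrote to during that round.
    """
    snapshot = dict(memory)              # round 0: full copy of every page
    copied_per_round = [len(snapshot)]
    for round_no in range(1, max_rounds + 1):
        dirty = run_and_collect_dirty(round_no)
        for page in dirty:               # re-copy only what changed
            snapshot[page] = memory[page]
        copied_per_round.append(len(dirty))
        if len(dirty) <= threshold:      # small enough: this was the final round
            break
    return snapshot, copied_per_round
```

With a task whose write set shrinks every round, each pass copies less than
the previous one, which is the property the migration relies on.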

Other currently missing stuff is quite minor or doesn't require any new things
from the kernel, like signalfd-s or netfilter.

The Ultimate Goal is hard to describe because we have a variety of ideas
about what CRIU can do, including such things as checkpointing desktop apps
with their xserver state or live-migrating parts of a multi-process app from
one box to another.

Thanks,
Pavel