Message-ID: <20090314082532.GB16436@elte.hu>
Date: Sat, 14 Mar 2009 09:25:32 +0100
From: Ingo Molnar <mingo@...e.hu>
To: Alexey Dobriyan <adobriyan@...il.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>,
Ying Han <yinghan@...gle.com>,
"Serge E. Hallyn" <serue@...ibm.com>, linux-api@...r.kernel.org,
containers@...ts.linux-foundation.org, hpa@...or.com,
linux-kernel@...r.kernel.org,
Dave Hansen <dave@...ux.vnet.ibm.com>, linux-mm@...ck.org,
viro@...iv.linux.org.uk, mpm@...enic.com,
Andrew Morton <akpm@...ux-foundation.org>, xemul@...nvz.org,
tglx@...utronix.de
Subject: Re: How much of a mess does OpenVZ make? ;) Was: What can OpenVZ
do?

* Alexey Dobriyan <adobriyan@...il.com> wrote:

> On Fri, Mar 13, 2009 at 02:01:50PM -0700, Linus Torvalds wrote:
> >
> >
> > On Fri, 13 Mar 2009, Alexey Dobriyan wrote:
> > > >
> > > > Let's face it, we're not going to _ever_ checkpoint any
> > > > kind of general case process. Just TCP makes that
> > > > fundamentally impossible in the general case, and there
> > > > are lots and lots of other cases too (just something as
> > > > totally _trivial_ as all the files in the filesystem
> > > > that don't get rolled back).
> > >
> > > What do you mean here? Unlinked files?
> >
> > Or modified files, or anything else. "External state" is a
> > pretty damn wide net. It's not just TCP sequence numbers and
> > another machine.
>
> I think (I think) you're seriously underestimating what's
> doable with kernel C/R and what's already done.
>
> I was told (haven't seen it myself) that Oracle installations
> and Counter Strike servers were moved between boxes just fine.
>
> They were run in specially prepared environment of course, but
> still.

That's the kind of stuff i'd like to see happen.

Right now the main 'enterprise' approach to do
migration/consolidation of server contexts is based on hardware
virtualization - but that pushes runtime overhead to the native
kernel and slows down the guest context as well - massively so.

Before we've blinked twice it will be a 'required' enterprise
feature and enterprise people will measure/benchmark Linux
server performance in guest context primarily and we'll have a
deep performance pit to dig ourselves out of.

We can ignore that trend as uninteresting (it is uninteresting
in a number of ways because it is partly driven by stupidity),
or we can do something about it while still advancing the
kernel.

With containers+checkpointing the code is a lot scarier (we
basically do system call virtualization), the environment
interactions are a lot wider and thus they are a lot more
difficult to handle - but it's all a lot faster as well, and
conceptually so. All the runtime overhead is pushed to the
checkpointing step - (with some minimal amount of data structure
isolation overhead).

I see three conceptual levels of virtualization:

 - hardware based virtualization, for 'unaware OSs'

 - system call based virtualization, for 'unaware software'

 - no virtualization kernel help is needed _at all_ to
   checkpoint 'aware' software. We have libraries to checkpoint
   'aware' user-space just fine - and had them for a decade.

	Ingo