Message-ID: <20090224044752.GB3202@x200.localdomain>
Date: Tue, 24 Feb 2009 07:47:52 +0300
From: Alexey Dobriyan <adobriyan@...il.com>
To: Dave Hansen <dave@...ux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@...e.hu>, Nathan Lynch <nathanl@...tin.ibm.com>,
linux-api@...r.kernel.org, containers@...ts.linux-foundation.org,
mpm@...enic.com, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
viro@...iv.linux.org.uk, hpa@...or.com,
Andrew Morton <akpm@...ux-foundation.org>,
torvalds@...ux-foundation.org, tglx@...utronix.de, xemul@...nvz.org
Subject: Re: Banning checkpoint (was: Re: What can OpenVZ do?)
On Thu, Feb 19, 2009 at 11:11:54AM -0800, Dave Hansen wrote:
> On Thu, 2009-02-19 at 22:06 +0300, Alexey Dobriyan wrote:
> > Inotify isn't supported yet? You do
> >
> > 	if (!list_empty(&inode->inotify_watches))
> > 		return -E;
> >
> > without hooking into inotify syscalls.
> >
> > ptrace(2) isn't supported -- look at struct task_struct::ptraced and
> > friends.
> >
> > And so on.
> >
> > System call (or whatever) does something with some piece of kernel
> > internals. We look at this "something" when walking data structures
> > and abort if it's scary enough.
> >
> > Please, show at least one counter-example.
>
> Alexey, I agree with you here. I've been fighting myself internally
> about these two somewhat opposing approaches. Of *course* we can
> determine the "checkpointability" at sys_checkpoint() time by checking
> all the various bits of state.
>
> The problem that I think Ingo is trying to address here is that doing it
> then makes it hard to figure out _when_ you went wrong. That's the
> single most critical piece of finding out how to go address it.
>
> I see where you are coming from. Ingo's suggestion has the *huge*
> downside that we've got to go muck with a lot of generic code and hook
> into all the things we don't support.
>
> I think what I posted is a decent compromise. It gets you those
> warnings at runtime and is a one-way trip for any given process. But,
> it does detect in certain cases (fork() and unshare(FILES)) when it is
> safe to make the trip back to the "I'm checkpointable" state again.
"Checkpointable" is not even per-process property.
Imagine, set of SAs (struct xfrm_state) and SPDs (struct xfrm_policy).
They are a) per-netns, b) persistent.
You can hook into socketcalls to mark process as uncheckpointable,
but since SAs and SPDs are persistent, original process already exited.
You're going to walk every process with same netns as SA adder and mark
it as uncheckpointable. Definitely doable, but ugly, isn't it?
Same for iptable rules.
"Checkpointable" is container property, OK?