Message-ID: <20090310215305.GA2078@x200.localdomain>
Date: Wed, 11 Mar 2009 00:53:05 +0300
From: Alexey Dobriyan <adobriyan@...il.com>
To: Dave Hansen <dave@...ux.vnet.ibm.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, mpm@...enic.com,
containers@...ts.linux-foundation.org, hpa@...or.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
viro@...iv.linux.org.uk, linux-api@...r.kernel.org, mingo@...e.hu,
torvalds@...ux-foundation.org, tglx@...utronix.de, xemul@...nvz.org
Subject: Re: How much of a mess does OpenVZ make? ;) Was: What can OpenVZ
do?
On Thu, Feb 26, 2009 at 06:57:55PM +0300, Alexey Dobriyan wrote:
> On Thu, Feb 12, 2009 at 03:04:05PM -0800, Dave Hansen wrote:
> > dave@...itz:~/kernels/linux-2.6-openvz$ git diff v2.6.27.10... kernel/cpt/ | diffstat
> > 47 files changed, 20702 insertions(+)
> >
> > One important thing that leaves out is the interaction that this code
> > has with the rest of the kernel. That's critically important when
> > considering long-term maintenance, and I'd be curious how the OpenVZ
> > folks view it.
>
> OpenVZ as-is in some cases wants some functions to be made global
> (and if C/R code will be modular, exported). Or probably several
> iterators added.
>
> But it's negligible amount of changes compared to main code.

Here is what the C/R code wants from the pid allocator.

With the introduction of hierarchical PID namespaces, a struct pid can
have not one but many numbers: a tuple (pid_0, pid_1, ..., pid_N),
where pid_i is the pid number in the pid_ns at level i.

If the root pid_ns of the container has level n, the numbers from level n
to N inclusive must be dumped and restored.

During struct pid creation, the numbers at levels 0 to n-1 can be anything,
because they're outside of the container's pid_ns, but the rest must be the
same as the dumped ones.

The code will be ifdeffed and commented, but anyhow, this is an example of
a change C/R will require from the rest of the kernel.
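
To make the level arithmetic concrete, here is a minimal userspace sketch,
not kernel code. struct pid_model, dump_pid_numbers() and MAX_LEVELS are
made-up names for illustration; the only thing taken from the description
above is that the numbers at levels n..N are the ones to preserve. They are
stored deepest level first, which is how the cr_nr[] argument in the patch
below is indexed.

```c
#include <assert.h>

#define MAX_LEVELS 8	/* arbitrary limit for this sketch */

/* Hypothetical userspace model of the per-level numbers in a struct pid. */
struct pid_model {
	unsigned int level;		/* N: level of the deepest pid_ns */
	int numbers[MAX_LEVELS];	/* numbers[i] is the pid number at level i */
};

/*
 * Copy the numbers C/R has to preserve (levels n..N) into out[],
 * deepest level first: out[0] is the number at level N.
 * Returns how many numbers were copied, i.e. N - n + 1.
 */
static unsigned int dump_pid_numbers(const struct pid_model *pid,
				     unsigned int n, int *out)
{
	unsigned int count = 0;
	int i;

	assert(pid->level < MAX_LEVELS && n <= pid->level);

	for (i = (int)pid->level; i >= (int)n; i--)
		out[count++] = pid->numbers[i];
	return count;
}
```

For a task with numbers (4242, 17, 1) at levels 0..2 and a container rooted
at level n = 1, this saves {1, 17} and leaves 4242 alone, since the level 0
number is outside the container.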
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -182,6 +182,34 @@ static int alloc_pidmap(struct pid_namespace *pid_ns)
 	return -1;
 }
 
+static int set_pidmap(struct pid_namespace *pid_ns, pid_t pid)
+{
+	int offset;
+	struct pidmap *map;
+
+	offset = pid & BITS_PER_PAGE_MASK;
+	map = &pid_ns->pidmap[pid/BITS_PER_PAGE];
+	if (unlikely(!map->page)) {
+		void *page = kzalloc(PAGE_SIZE, GFP_KERNEL);
+		/*
+		 * Free the page if someone raced with us
+		 * installing it:
+		 */
+		spin_lock_irq(&pidmap_lock);
+		if (map->page)
+			kfree(page);
+		else
+			map->page = page;
+		spin_unlock_irq(&pidmap_lock);
+		if (unlikely(!map->page))
+			return -ENOMEM;
+	}
+	if (test_and_set_bit(offset, map->page))
+		return -EBUSY;
+	atomic_dec(&map->nr_free);
+	return pid;
+}
+
 int next_pidmap(struct pid_namespace *pid_ns, int last)
 {
 	int offset;
@@ -239,7 +267,7 @@ void free_pid(struct pid *pid)
 	call_rcu(&pid->rcu, delayed_put_pid);
 }
 
-struct pid *alloc_pid(struct pid_namespace *ns)
+struct pid *alloc_pid(struct pid_namespace *ns, int *cr_nr, unsigned int cr_level)
 {
 	struct pid *pid;
 	enum pid_type type;
@@ -253,7 +281,10 @@ struct pid *alloc_pid(struct pid_namespace *ns)
 
 	tmp = ns;
 	for (i = ns->level; i >= 0; i--) {
-		nr = alloc_pidmap(tmp);
+		if (cr_nr && ns->level - i <= cr_level)
+			nr = set_pidmap(tmp, cr_nr[ns->level - i]);
+		else
+			nr = alloc_pidmap(tmp);
 		if (nr < 0)
 			goto out_free;
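
For illustration only, the behaviour of the patched loop can be modelled in
userspace. This is a sketch under assumptions, not the kernel
implementation: struct pidmap_model and the model_*() helpers are
hypothetical names, the bitmap is a fixed array with no locking, the range
check in model_set_pid() is not in the patch, and there is no rollback on
failure as alloc_pid() does via out_free.

```c
#include <errno.h>
#include <limits.h>

#define MODEL_PID_MAX		1024
#define MODEL_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical single-threaded stand-in for one namespace's pidmap. */
struct pidmap_model {
	unsigned long bits[MODEL_PID_MAX / MODEL_BITS_PER_LONG];
};

/* Claim exactly `pid`, like set_pidmap(): -EBUSY if it is already taken. */
static int model_set_pid(struct pidmap_model *map, int pid)
{
	unsigned long mask;

	if (pid <= 0 || pid >= MODEL_PID_MAX)
		return -EINVAL;	/* range check added for the sketch only */
	mask = 1UL << (pid % MODEL_BITS_PER_LONG);
	if (map->bits[pid / MODEL_BITS_PER_LONG] & mask)
		return -EBUSY;
	map->bits[pid / MODEL_BITS_PER_LONG] |= mask;
	return pid;
}

/* Claim the lowest free number; a simplified stand-in for alloc_pidmap(). */
static int model_alloc_pid(struct pidmap_model *map)
{
	int pid;

	for (pid = 1; pid < MODEL_PID_MAX; pid++)
		if (model_set_pid(map, pid) == pid)
			return pid;
	return -1;
}

/*
 * Fill numbers[0..level] the way the patched alloc_pid() loop does.
 * maps[i] models the namespace at level i. Walking from the deepest
 * namespace (i == level) towards the root, the first cr_level + 1 steps
 * take their number from cr_nr[], the remaining ones get any free number.
 * Returns 0 on success or a negative value on failure (no rollback here).
 */
static int model_alloc_numbers(struct pidmap_model *maps, unsigned int level,
			       const int *cr_nr, unsigned int cr_level,
			       int *numbers)
{
	int i, nr;

	for (i = (int)level; i >= 0; i--) {
		unsigned int depth = level - (unsigned int)i;

		if (cr_nr && depth <= cr_level)
			nr = model_set_pid(&maps[i], cr_nr[depth]);
		else
			nr = model_alloc_pid(&maps[i]);
		if (nr < 0)
			return nr;
		numbers[i] = nr;
	}
	return 0;
}
```

With three levels, cr_nr = {1, 17} and cr_level = 1, the two innermost
levels get exactly 1 and 17 while level 0 gets whatever number is free.
Repeating the call with the same cr_nr fails with -EBUSY, which is the case
set_pidmap() reports when a restored number is already in use.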