linux-kernel - Re: [RFC][PATCH 4/4] cgroup-memcg fix frequent EBUSY at rmdir v2


Message-ID: <6599ad830901210200q77b2553ag35f706c321a18d83@mail.gmail.com>
Date:	Wed, 21 Jan 2009 02:00:56 -0800
From:	Paul Menage <menage@...gle.com>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"balbir@...ux.vnet.ibm.com" <balbir@...ux.vnet.ibm.com>,
	"nishimura@....nes.nec.co.jp" <nishimura@....nes.nec.co.jp>,
	"lizf@...fujitsu.com" <lizf@...fujitsu.com>
Subject: Re: [RFC][PATCH 4/4] cgroup-memcg fix frequent EBUSY at rmdir v2

On Tue, Jan 20, 2009 at 2:47 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@...fujitsu.com> wrote:
>        CGRP_NOTIFY_ON_RELEASE,
> +       /* Someone calls rmdir() and is wating for this cgroup is released */

/* A thread is in rmdir() waiting to destroy this cgroup */

Also document that it can only be set/cleared while you're holding the
inode mutex (i_mutex) for the cgroup directory. And we should probably
move this enum inside cgroup.c, since nothing in the header file uses it.

> +       CGRP_WAIT_ON_RMDIR,
>  };

>
>  struct cgroup {
> @@ -350,7 +352,7 @@ int cgroup_is_descendant(const struct cg
>  struct cgroup_subsys {
>        struct cgroup_subsys_state *(*create)(struct cgroup_subsys *ss,
>                                                  struct cgroup *cgrp);
> -       void (*pre_destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
> +       int (*pre_destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);

Can you update the documentation to indicate what an error result from
pre_destroy indicates? Can pre_destroy() be called multiple times for
the same subsystem/cgroup?
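
As a rough illustration of the contract being asked about, here is a userspace sketch (not kernel code). It assumes the answer turns out to be "0 on success, -EBUSY if the cgroup cannot be emptied yet, and the callback may be called again on a later attempt". The struct layouts, the charges field and demo_pre_destroy() are hypothetical stand-ins, not the real definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; not the real definitions. */
struct cgroup {
	int charges;	/* hypothetical: resources still charged to the group */
};

struct cgroup_subsys {
	/*
	 * Assumed contract: return 0 if the subsystem is ready for the
	 * cgroup to be destroyed, or -EBUSY if it cannot be emptied yet.
	 * May be called more than once for the same subsystem/cgroup.
	 */
	int (*pre_destroy)(struct cgroup_subsys *ss, struct cgroup *cgrp);
};

/* Hypothetical callback: idempotent, so a repeated call is harmless. */
static int demo_pre_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
{
	(void)ss;
	if (cgrp->charges > 0)
		return -EBUSY;
	return 0;
}
```

Writing the return values and the "may be called again" rule next to the callback declaration is the kind of documentation the question above is after.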

> +
> +       /* wake up rmdir() waiter....it should fail.*/

/* Wake up rmdir() waiter - the rmdir should fail since the cgroup is
no longer empty */

But is this safe? If we do a pre-destroy, is it OK to let new tasks
into the cgroup?

> @@ -2446,6 +2461,8 @@ static long cgroup_create(struct cgroup
>
>        mutex_unlock(&cgroup_mutex);
>        mutex_unlock(&cgrp->dentry->d_inode->i_mutex);
> +       if (wakeup_on_rmdir(parent))
> +               cgroup_rmdir_wakeup_waiters();

I don't think that there can be a waiter, since rmdir() would hold the
parent's inode mutex (i_mutex), which would block this thread before it
gets to cgroup_create().

> +DECLARE_WAIT_QUEUE_HEAD(cgroup_rmdir_waitq);
> +
> +static void cgroup_rmdir_wakeup_waiters(void)
> +{
> +       wake_up_all(&cgroup_rmdir_waitq);
> +}
> +

I think you can merge wakeup_on_rmdir() and
cgroup_rmdir_wakeup_waiters() into a single function,
cgroup_wakeup_rmdir(struct cgroup *).
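
A minimal userspace sketch of what the merged helper could look like. It assumes wakeup_on_rmdir() simply tests the CGRP_WAIT_ON_RMDIR flag, which the quoted hunks do not show. The struct layout, the bit value and stub_wake_up_all() are stand-ins; in the kernel the body would use test_bit() and wake_up_all(&cgroup_rmdir_waitq).

```c
/* Bit index in cgroup->flags; the value here is illustrative only. */
enum { CGRP_WAIT_ON_RMDIR = 0 };

/* Userspace stand-in for the kernel struct. */
struct cgroup {
	unsigned long flags;
};

static int wakeups;	/* counts calls to the wake-up stub */

/* Stand-in for wake_up_all(&cgroup_rmdir_waitq). */
static void stub_wake_up_all(void)
{
	wakeups++;
}

/* Check the flag and wake any rmdir() waiter in one place. */
static void cgroup_wakeup_rmdir(struct cgroup *cgrp)
{
	if (cgrp->flags & (1UL << CGRP_WAIT_ON_RMDIR))
		stub_wake_up_all();
}
```

Callers then become a single unconditional cgroup_wakeup_rmdir(cgrp) call rather than an if plus a second function.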


>
> +       if (signal_pending(current))
> +               return -EINTR;

I think it would be better to move this check to after we've already
failed on cgroup_clear_css_refs(). That way we can't fail with an
EINTR just because we raced with a signal on the way into rmdir() - we
have to actually hit the EBUSY and try to sleep.
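
The ordering suggested above can be sketched in userspace as follows. This is only a model of the control flow: struct rmdir_env, clear_css_refs() and rmdir_sketch() are hypothetical stand-ins for signal_pending(current), cgroup_clear_css_refs() and the rmdir path, and the real code would sleep on the wait queue where the sketch just retries.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state the rmdir path looks at. */
struct rmdir_env {
	bool signal_pending;	/* stand-in for signal_pending(current) */
	int busy_attempts;	/* how many times clearing css refs fails */
};

/* Stand-in for cgroup_clear_css_refs(): true on success. */
static bool clear_css_refs(struct rmdir_env *env)
{
	if (env->busy_attempts > 0) {
		env->busy_attempts--;
		return false;
	}
	return true;
}

static int rmdir_sketch(struct rmdir_env *env)
{
	for (;;) {
		if (clear_css_refs(env))
			return 0;
		/*
		 * Only after actually hitting the busy case do we look at
		 * signals, so a signal that merely raced with entry into
		 * rmdir() cannot cause -EINTR on its own.
		 */
		if (env->signal_pending)
			return -EINTR;
		/* Real code would sleep here until woken; the sketch retries. */
	}
}
```

With this ordering, an rmdir() that could have succeeded immediately still succeeds even if a signal is pending.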

> +       ret = cgroup_call_pre_destroy(cgrp);
> +       if (ret == -EBUSY)
> +               return -EBUSY;

What about other potential error codes? If the subsystem's only
allowed to return 0 or -EBUSY, then we should check for that.
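
One possible shape for such a check, as a userspace sketch. It assumes the rule ends up being "only 0 or -EBUSY", and that an out-of-contract value should still fail the rmdir rather than be silently ignored. handle_pre_destroy_result() and the unexpected flag are hypothetical; kernel code might use WARN_ON() where the sketch sets the flag.

```c
#include <errno.h>

/*
 * Propagate any non-zero result so rmdir fails on every error, and
 * record when the subsystem returned something outside the assumed
 * 0 / -EBUSY contract.
 */
static int handle_pre_destroy_result(int ret, int *unexpected)
{
	if (ret != 0 && ret != -EBUSY)
		*unexpected = 1;	/* kernel code might WARN_ON() here */
	return ret;
}
```

The caller would then return on any non-zero result instead of testing for -EBUSY alone.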

Paul