linux-kernel - Re: [PATCH] cgroups: defer free css

Message-ID: <6599ad830811211028v2d9532cfsfef18c0af5935964@mail.gmail.com>
Date:	Fri, 21 Nov 2008 10:28:44 -0800
From:	Paul Menage <menage@...gle.com>
To:	Lai Jiangshan <laijs@...fujitsu.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linux Containers <containers@...ts.linux-foundation.org>
Subject: Re: [PATCH] cgroups: defer free css_set

On Fri, Nov 21, 2008 at 12:49 AM, Lai Jiangshan <laijs@...fujitsu.com> wrote:
>
> We currently free a css_set immediately when its refcnt becomes 0 (except in
> cgroup_attach_task()). This destroys data that the read side may still be
> accessing. This patch uses call_rcu() to defer freeing the css_set.
>
> Signed-off-by: Lai Jiangshan <laijs@...fujitsu.com>
> ---
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 1164963..22901ff 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -178,6 +178,8 @@ struct css_set {
>         */
>        struct list_head cg_links;
>
> +       struct rcu_head rcu;
> +
>        /*
>         * Set of subsystem states, one for each subsystem. This array
>         * is immutable after creation apart from the init_css_set
> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
> index 358e775..ddc10ac 100644
> --- a/kernel/cgroup.c
> +++ b/kernel/cgroup.c
> @@ -252,6 +252,11 @@ static void unlink_css_set(struct css_set *cg)
>        }
>  }
>
> +static void rcu_free_css_set(struct rcu_head *head)
> +{
> +       kfree(container_of(head, struct css_set, rcu));
> +}
> +
>  static void __put_css_set(struct css_set *cg, int taskexit)
>  {
>        int i;
> @@ -281,7 +286,7 @@ static void __put_css_set(struct css_set *cg, int taskexit)
>                }
>        }
>        rcu_read_unlock();
> -       kfree(cg);
> +       call_rcu(&cg->rcu, rcu_free_css_set);
>  }
>
>  /*
> @@ -1267,7 +1277,6 @@ int cgroup_attach_task(struct cgroup *cgrp, struct task_struct *tsk)
>                        ss->attach(ss, cgrp, oldcgrp, tsk);
>        }
>        set_bit(CGRP_RELEASABLE, &oldcgrp->flags);
> -       synchronize_rcu();

I'm reluctant to remove this synchronize_rcu() call - it gives the
property that if you get a pointer to a task's cgroup protected by
RCU, then even if you race with the task moving away to a different
cgroup, no other cgroup_mutex-protected operation can start until
you've finished your RCU section (since the thread that you raced with
is blocking in synchronize_rcu() while holding cgroup_mutex). I'm
pretty sure that some of the cgroups code relies on that property,
although I can't find exactly which bit I'm thinking of.

Also, using call_rcu() for freeing all css_sets seems unnecessary -
the only one that appears to be potentially broken is the one from
cgroup_exit(), since in the other cases the css_set hasn't been
visible via a task->cgroups pointer. So how about making
__put_css_set() do a call_rcu() for the case when taskexit is true,
and a plain kfree() otherwise? That would also reduce the chance of
overloading the RCU system with too many deferred frees.
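
As a rough illustration of that suggestion, here is a userspace mock, not
kernel code and not a tested patch. The mock_* helpers, alloc_css_set(),
free_css_set() and put_css_set_sketch() are hypothetical names invented for
this sketch; the real __put_css_set() also unlinks the css_set and updates
cgroup counts, which is left out here. struct css_set is cut down to a plain
integer reference count and the rcu_head from the patch above. The only point
shown is the branch on taskexit: defer the free through a call_rcu()-like
queue on the exit path, free immediately otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for kernel primitives; NOT the real RCU API. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct rcu_head *mock_pending;	/* callbacks awaiting a grace period */
static int mock_pending_count;		/* number of queued callbacks */
static int mock_freed_count;		/* number of css_sets actually freed */

/* Mock of call_rcu(): queue the callback instead of running it now. */
static void mock_call_rcu(struct rcu_head *head,
			  void (*func)(struct rcu_head *head))
{
	head->func = func;
	head->next = mock_pending;
	mock_pending = head;
	mock_pending_count++;
}

/* Mock of a grace period ending: run every queued callback. */
static void mock_grace_period_elapsed(void)
{
	while (mock_pending) {
		struct rcu_head *head = mock_pending;

		mock_pending = head->next;
		mock_pending_count--;
		head->func(head);
	}
}

/* Cut-down css_set: a simplified refcount plus the rcu_head. */
struct css_set {
	int refcount;
	struct rcu_head rcu;
};

static struct css_set *alloc_css_set(void)
{
	struct css_set *cg = calloc(1, sizeof(*cg));

	if (cg)
		cg->refcount = 1;
	return cg;
}

static void free_css_set(struct css_set *cg)
{
	free(cg);	/* stands in for kfree() */
	mock_freed_count++;
}

static void rcu_free_css_set(struct rcu_head *head)
{
	free_css_set(container_of(head, struct css_set, rcu));
}

/* Drop one reference; on the last one, defer the free only for task exit. */
static void put_css_set_sketch(struct css_set *cg, int taskexit)
{
	assert(cg->refcount > 0);
	if (--cg->refcount > 0)
		return;

	if (taskexit)
		mock_call_rcu(&cg->rcu, rcu_free_css_set);	/* readers may still see it */
	else
		free_css_set(cg);	/* never visible via task->cgroups */
}
```

In this mock, a final put with taskexit == 0 frees the object on the spot,
while a final put with taskexit == 1 leaves it queued until
mock_grace_period_elapsed() runs, so only the cgroup_exit() path adds work to
the deferred-free queue.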

Paul