linux-kernel - Re: [GIT PULL] cgroup changes for v3.15-rc1

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Fri, 4 Apr 2014 17:14:41 +0800
From:	Li Zefan <lizefan@...wei.com>
To:	Tejun Heo <tj@...nel.org>
CC:	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	David Miller <davem@...emloft.net>,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
	Linux Containers <containers@...ts.linux-foundation.org>,
	<cgroups@...r.kernel.org>
Subject: Re: [GIT PULL] cgroup changes for v3.15-rc1

On 2014/4/4 3:43, Tejun Heo wrote:
> Hello,
> 
> On Thu, Apr 03, 2014 at 12:01:23PM -0700, Linus Torvalds wrote:
>> [ Extending the participants list a bit ]
>>
>> On Thu, Apr 3, 2014 at 11:34 AM, Tejun Heo <tj@...nel.org> wrote:
>>> On the road so sending from phone. Iirc the param is necessary to
>>> distinguishe when a new sb is created so that it can be put properly later.
>>> I think cgroup is leaking super ref now and li was planning to send a fix
>>> once things are merged.
>>
>> So as far as I can tell, cgroup is fine, because the superblock itself
>> is properly refcounted by the mounting code. It's the magic hidden
> 
> Ah, I remembered the other way around.  We could leak cgroup_root
> reference, not the other way around.  cgroup_mount() can be called
> multiple times for the same sb and we inc cgroup_root's ref each time
> but cgroup_kill_sb() only happens when the sb is released, so if we do
> the following,
> 
>  # mkdir cpuset cpuset1
>  # mount -t cgroup -o cpuset cgroup cpuset
>  # mount -t cgroup -o cpuset cgroup cpuset1
>  # umount cpuset
>  # umount cpuset1
> 
> The cgroup_root should be destroyed but it isn't, I think.  We'd need
> to bump cgroup_root's refcnt only when a new sb is created.

Yeah, I had been waiting for the kernfs change to be merged into mainline,
so I can fix this cgroup refcnting, and here is the fix.

====================

[PATCH] cgroup: fix top cgroup refcnt leak

As mount() and kill_sb() is not a one-to-one match, If we mount the same
cgroupfs in serveral mount points, and then umount all of them, kill_sb()
will be called only once.

Try:
        # mount -t cgroup -o cpuacct xxx /cgroup
        # mount -t cgroup -o cpuacct xxx /cgroup2
        # cat /proc/cgroups | grep cpuacct
        cpuacct 2       1       1
        # umount /cgroup
        # umount /cgroup2
        # cat /proc/cgroups | grep cpuacct
        cpuacct 2       1       1

You'll see cgroupfs will never be freed.

Signed-off-by: Li Zefan <lizefan@...wei.com>
---
 kernel/cgroup.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index 2189462..48bd9c9 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -1487,6 +1487,7 @@ static struct dentry *cgroup_mount(struct file_system_type *fs_type,
 	struct cgroup_sb_opts opts;
 	struct dentry *dentry;
 	int ret;
+	bool new_sb;
 
 	/*
 	 * The first time anyone tries to mount a cgroup, enable the list
@@ -1603,8 +1604,8 @@ out_unlock:
 	if (ret)
 		return ERR_PTR(ret);
 
-	dentry = kernfs_mount(fs_type, flags, root->kf_root, NULL);
-	if (IS_ERR(dentry))
+	dentry = kernfs_mount(fs_type, flags, root->kf_root, &new_sb);
+	if (IS_ERR(dentry) || !new_sb)
 		cgroup_put(&root->cgrp);
 	return dentry;
 }
-- 
1.8.0.2


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/