linux-kernel - Kernel crash in cgroup_pidlist_destroy_work

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAM_iQpVNzx1r8x-bP5CoiCX8PFk15JYHw_XfpYvJGgdkFHj8Gw@mail.gmail.com>
Date:	Tue, 16 Sep 2014 16:56:16 -0700
From:	Cong Wang <xiyou.wangcong@...il.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	LKML <linux-kernel@...r.kernel.org>, cgroups@...r.kernel.org
Subject: Kernel crash in cgroup_pidlist_destroy_work_fn()

Hi, Tejun

We saw some kernel null pointer dereference in
cgroup_pidlist_destroy_work_fn(), more precisely at
__mutex_lock_slowpath(), on 3.14. I can show you the full stack trace
on request.

Looking at the code, it seems flush_workqueue() doesn't care about new
incoming works, it only processes currently pending ones, if this is
correct, then we could have the following race condition:

cgroup_pidlist_destroy_all():
        //...
        mutex_lock(&cgrp->pidlist_mutex);
        list_for_each_entry_safe(l, tmp_l, &cgrp->pidlists, links)
                mod_delayed_work(cgroup_pidlist_destroy_wq,
&l->destroy_dwork, 0);
        mutex_unlock(&cgrp->pidlist_mutex);

        // <--- another process calls cgroup_pidlist_start() here
since mutex is released

        flush_workqueue(cgroup_pidlist_destroy_wq); // <--- another
process adds new pidlist and queue work in pararell
        BUG_ON(!list_empty(&cgrp->pidlists)); // <--- This check is
passed, list_add() could happen after this

Therefore, the newly added pidlist will point to a freed cgroup, and
when it is freed in the delayed work we will crash.

The attached patch (compile test ONLY) could be a possible fix, since
it will check and hold a refcount on this cgroup in
cgroup_pidlist_start(). But I could very easily miss something here
since there are many cgroup changes after 3.14 and I don't follow
cgroup development.

What do you think?

Thanks.

View attachment "cgroup.diff" of type "text/plain" (938 bytes)