Message-ID: <e0bb137f-f5f6-78d7-45cf-9710dc5a0608@redhat.com>
Date: Fri, 22 Sep 2017 09:00:04 -0400
From: Waiman Long <longman@...hat.com>
To: shuwang@...hat.com, tj@...nel.org, lizefan@...wei.com,
mingo@...nel.org, keescook@...omium.org
Cc: linux-kernel@...r.kernel.org, chuhu@...hat.com, yizhan@...hat.com
Subject: Re: [PATCH] cgroup: cpuset: fix panic when offline a cpu
On 09/22/2017 07:00 AM, shuwang@...hat.com wrote:
> From: Shu Wang <shuwang@...hat.com>
>
> cgroup_migrate() assumes that mgctx.tset.csets points to
> tset.src_csets on entry. cgroup_migrate_add_task() adds tasks to
> tset.src_csets, and cgroup_migrate_execute() then switches
> tset.csets to tset.dst_csets.
>
> When a cpu is taken offline, cgroup_transfer_tasks() migrates the
> first task, which leaves tset.csets pointing to dst_csets. Migrating
> the next task with the same mgctx then gets a NULL pointer from
> cgroup_taskset_first().
>
> reproducer on my 2-cpu machine:
> mkdir /sys/fs/cgroup/cpuset/test
> cd /sys/fs/cgroup/cpuset/test
> echo 1 > cpuset.cpus
> echo 0 > cpuset.mems
> sleep 100 & echo $! > tasks
> sleep 100 & echo $! > tasks
> echo 0 > /sys/bus/cpu/devices/cpu1/online
>
> backtrace:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000cf8
> IP: cpuset_can_attach+0x2f/0x140
> Call Trace:
> ? cpuset_attach+0x30f/0x3d0
> cgroup_migrate_execute+0x71/0x3c0
> cgroup_migrate+0x75/0x80
> cgroup_transfer_tasks+0x1b2/0x230
> cpuset_hotplug_workfn+0xa7d/0xce0
> ? finish_task_switch+0x79/0x240
> process_one_work+0x149/0x360
> worker_thread+0x4d/0x3c0
>
> Signed-off-by: Shu Wang <shuwang@...hat.com>
> ---
> kernel/cgroup/cgroup-v1.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
> index 024085daab1a..165734573b5e 100644
> --- a/kernel/cgroup/cgroup-v1.c
> +++ b/kernel/cgroup/cgroup-v1.c
> @@ -129,6 +129,12 @@ int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
> css_task_iter_end(&it);
>
> if (task) {
> + /*
> + * Reset csets to src_csets, as cgroup_migrate assumes
> + * csets is pointing to src_csets.
> + */
> + mgctx.tset.csets = &mgctx.tset.src_csets;
> +
> ret = cgroup_migrate(task, false, &mgctx);
> if (!ret)
> trace_cgroup_transfer_tasks(to, task, false);
I had actually sent a patch to fix the same bug yesterday. See
https://lkml.org/lkml/2017/9/21/333
Cheers,
Longman
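
For readers following the thread, the stale-pointer problem described in
the quoted changelog can be sketched in userspace C. This is only a toy
model under stated assumptions, not kernel code: the lists hold plain
task ids instead of css_sets, an empty list is reported as -1 where the
kernel would see a NULL task, and the names taskset_init, taskset_first,
migrate_add_task, migrate_execute and migrate_one are hypothetical
stand-ins for the kernel functions mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TASKS 8

/* Toy stand-in for a list of css_sets: just an array of task ids. */
struct id_list {
	int ids[MAX_TASKS];
	size_t count;
};

/* Toy stand-in for the taskset: two lists plus the "current" pointer. */
struct taskset {
	struct id_list src_csets;
	struct id_list dst_csets;
	struct id_list *csets;	/* list that iteration walks */
};

static void taskset_init(struct taskset *tset)
{
	memset(tset, 0, sizeof(*tset));
	tset->csets = &tset->src_csets;	/* the state migration expects */
}

/* Returns the first task id on the current list, or -1 if it is empty. */
static int taskset_first(const struct taskset *tset)
{
	return tset->csets->count ? tset->csets->ids[0] : -1;
}

static void migrate_add_task(struct taskset *tset, int task)
{
	assert(tset->src_csets.count < MAX_TASKS);
	tset->src_csets.ids[tset->src_csets.count++] = task;
}

/* Returns what a can_attach-like callback would see as the first task. */
static int migrate_execute(struct taskset *tset)
{
	int seen = taskset_first(tset);

	/* Commit: tasks move from src to dst and csets follows them. */
	tset->dst_csets = tset->src_csets;
	tset->src_csets.count = 0;
	tset->csets = &tset->dst_csets;

	/* Cleanup drains dst but leaves csets pointing at it. */
	tset->dst_csets.count = 0;
	return seen;
}

/* Migrate one task, optionally resetting csets first (the patch's idea). */
static int migrate_one(struct taskset *tset, int task, int reset_first)
{
	if (reset_first)
		tset->csets = &tset->src_csets;
	migrate_add_task(tset, task);
	return migrate_execute(tset);
}
```

In this model, calling migrate_one() twice on one taskset without the
reset returns the task id for the first call and -1 for the second,
because the second lookup walks the drained dst list. With reset_first
set, both calls return the id of the task that was just added.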