Message-ID: <Y35Swdpq+rJe+Tu3@slm.duckdns.org>
Date: Wed, 23 Nov 2022 07:05:05 -1000
From: Tejun Heo <tj@...nel.org>
To: "haifeng.xu" <haifeng.xu@...pee.com>
Cc: longman@...hat.com, lizefan.x@...edance.com, hannes@...xchg.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cgroup/cpuset: Optimize update_tasks_nodemask()
On Wed, Nov 23, 2022 at 08:21:57AM +0000, haifeng.xu wrote:
> When 'cpuset.mems' is changed under some cgroup, the system can hang
> for a long time. According to dmesg, many processes or threads are
> stuck in fork/exit. The reason is shown below.
>
> thread A:
>   cpuset_write_resmask /* takes cpuset_rwsem */
>     ...
>       update_tasks_nodemask
>         mpol_rebind_mm /* waits mmap_lock */
>
> thread B:
>   worker_thread
>     ...
>       cpuset_migrate_mm_workfn
>         do_migrate_pages /* takes mmap_lock */
>
> thread C:
>   cgroup_procs_write /* takes cgroup_mutex and cgroup_threadgroup_rwsem */
>     ...
>       cpuset_can_attach
>         percpu_down_write /* waits cpuset_rwsem */
>
> Once the nodemask of the cpuset is updated, thread A wakes up thread B
> to migrate the mm. But when thread A iterates through all tasks,
> including child threads and the group leader, it has to wait for the
> mmap_lock, which has been taken by thread B. Unfortunately, thread C
> wants to migrate tasks into the cgroup at this moment, so it must wait
> for thread A to release cpuset_rwsem. If thread B spends a long time
> migrating the mm, every fork/exit that acquires
> cgroup_threadgroup_rwsem also has to wait for a long time.
>
> There is no need to migrate the mm of child threads, since it is
> shared with the group leader.
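
For reference, the wait chain quoted above can be written down as a small
toy model. This is only a sketch, not kernel code: the dictionaries and
the helpers owner() and wait_chain() are made up for illustration, and
read/write lock modes are ignored.

```python
# Toy model of the quoted wait chain; not kernel code.
# holds[x] = locks currently held by context x
# waits[x] = the lock x is blocked on (None if x is running)
holds = {
    "A": {"cpuset_rwsem"},
    "B": {"mmap_lock"},
    "C": {"cgroup_mutex", "cgroup_threadgroup_rwsem"},
    "fork/exit": set(),
}
waits = {
    "A": "mmap_lock",
    "B": None,
    "C": "cpuset_rwsem",
    "fork/exit": "cgroup_threadgroup_rwsem",
}


def owner(lock):
    """Return the context currently holding `lock`, or None."""
    for ctx, locks in holds.items():
        if lock in locks:
            return ctx
    return None


def wait_chain(ctx):
    """Follow 'blocked on a lock held by' edges starting from ctx."""
    chain = [ctx]
    while waits[chain[-1]] is not None:
        nxt = owner(waits[chain[-1]])
        if nxt is None or nxt in chain:
            break  # unowned lock or a cycle: stop walking
        chain.append(nxt)
    return chain


print(" -> ".join(wait_chain("fork/exit")))  # fork/exit -> C -> A -> B
```

In this model nothing is deadlocked; fork/exit is simply stuck until
thread B finishes migrating and drops mmap_lock.
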
This is only a problem in cgroup1 and cgroup1 doesn't require the threads of
a given task to be in the same cgroup. I don't think you can optimize it
this way.
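
To illustrate with a toy model (a sketch only, not kernel code; the Task
tuple, the cpuset names "X" and "Y" and the helper mms_to_rebind() are
all hypothetical): assume a thread has been moved into cpuset Y while
its group leader stays in cpuset X, which cgroup1 allows.

```python
# Toy model, not kernel code. Each task has a pid, a tgid (the pid of
# its group leader), the cpuset it is attached to, and the mm it uses.
from collections import namedtuple

Task = namedtuple("Task", "pid tgid cpuset mm")

tasks = [
    Task(pid=100, tgid=100, cpuset="X", mm="mm100"),  # leader, in X
    Task(pid=101, tgid=100, cpuset="Y", mm="mm100"),  # its thread, in Y
    Task(pid=200, tgid=200, cpuset="Y", mm="mm200"),  # unrelated task
]


def mms_to_rebind(cpuset, leaders_only):
    """Collect the mms visited while walking the tasks of `cpuset`."""
    seen = set()
    for t in tasks:
        if t.cpuset != cpuset:
            continue
        if leaders_only and t.pid != t.tgid:
            continue  # the proposed shortcut: skip non-leader threads
        seen.add(t.mm)
    return seen


print(sorted(mms_to_rebind("Y", leaders_only=False)))  # ['mm100', 'mm200']
print(sorted(mms_to_rebind("Y", leaders_only=True)))   # ['mm200']
```

With the shortcut, mm100 is no longer visited when Y's nodemask changes,
because its group leader is attached to a different cpuset.
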
Thanks.
--
tejun