Message-ID: <Y35Swdpq+rJe+Tu3@slm.duckdns.org>
Date:   Wed, 23 Nov 2022 07:05:05 -1000
From:   Tejun Heo <tj@...nel.org>
To:     "haifeng.xu" <haifeng.xu@...pee.com>
Cc:     longman@...hat.com, lizefan.x@...edance.com, hannes@...xchg.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] cgroup/cpuset: Optimize update_tasks_nodemask()

On Wed, Nov 23, 2022 at 08:21:57AM +0000, haifeng.xu wrote:
> When 'cpuset.mems' is changed under some cgroup, the system hangs
> for a long time. According to dmesg, many processes or threads are
> stuck in fork/exit. The reason is shown below.
> 
> thread A:
> cpuset_write_resmask /* takes cpuset_rwsem */
>   ...
>     update_tasks_nodemask
>       mpol_rebind_mm /* waits mmap_lock */
> 
> thread B:
> worker_thread
>   ...
>     cpuset_migrate_mm_workfn
>       do_migrate_pages /* takes mmap_lock */
> 
> thread C:
> cgroup_procs_write /* takes cgroup_mutex and cgroup_threadgroup_rwsem */
>   ...
>     cpuset_can_attach
>       percpu_down_write /* waits cpuset_rwsem */
> 
> Once the nodemasks of the cpuset are updated, thread A wakes up thread B
> to migrate the mm. But when thread A iterates through all tasks, including
> child threads and the group leader, it has to wait for the mmap_lock, which
> has been taken by thread B. Unfortunately, thread C wants to migrate
> tasks into the cgroup at this moment, so it must wait for thread A to
> release cpuset_rwsem. If thread B spends a long time migrating the mm,
> fork/exit, which acquires cgroup_threadgroup_rwsem, also has to
> wait for a long time.
> 
> There is no need to migrate the mm of child threads, which is
> shared with the group leader.
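
The wait chain in the quoted report can be sketched as a small toy model.
This is only an illustration, not kernel code: the thread and lock names
come from the report, while the dictionaries and the blocking_chain()
helper are hypothetical. It follows each waiter to the holder of the lock
it needs, showing why a slow thread B ends up stalling fork/exit.

```python
# Toy model of the wait chain from the quoted report (not kernel code).
# Thread and lock names follow the report; the data layout is an assumption.
holds = {
    "A": {"cpuset_rwsem"},
    "B": {"mmap_lock"},
    "C": {"cgroup_mutex", "cgroup_threadgroup_rwsem"},
    "fork/exit": set(),
}
waits_for = {
    "A": "mmap_lock",
    "C": "cpuset_rwsem",
    "fork/exit": "cgroup_threadgroup_rwsem",
}


def blocking_chain(thread):
    """Follow waiter -> lock holder links until a thread that is not waiting."""
    chain = [thread]
    while chain[-1] in waits_for:
        lock = waits_for[chain[-1]]
        owner = next(t for t, locks in holds.items() if lock in locks)
        chain.append(owner)
    return chain


print(" -> ".join(blocking_chain("fork/exit")))
```

In this model the chain ends at thread B, the only thread that is not
itself waiting on a lock.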

This is only a problem in cgroup1 and cgroup1 doesn't require the threads of
a given task to be in the same cgroup. I don't think you can optimize it
this way.
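
One way to read this objection, as a simplified sketch rather than kernel
code: in cgroup1 a thread can sit in a cpuset that its group leader is not
in, so walking only group leaders could miss that mm entirely. The Task
class, the cpuset paths, and mms_to_rebind() below are all hypothetical
names made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    pid: int
    tgid: int    # pid of the thread group leader
    mm: str      # label for the address space shared by the thread group
    cpuset: str  # in cgroup1 this may differ between threads of one process

    @property
    def is_leader(self):
        return self.pid == self.tgid


tasks = [
    Task(pid=100, tgid=100, mm="mm_100", cpuset="/other"),   # leader is elsewhere
    Task(pid=101, tgid=100, mm="mm_100", cpuset="/target"),  # thread in the cpuset
    Task(pid=200, tgid=200, mm="mm_200", cpuset="/target"),
]


def mms_to_rebind(tasks, cpuset, leaders_only):
    """Collect the mms reached by walking the tasks attached to one cpuset."""
    members = [t for t in tasks if t.cpuset == cpuset]
    if leaders_only:
        members = [t for t in members if t.is_leader]
    return {t.mm for t in members}


all_threads = mms_to_rebind(tasks, "/target", leaders_only=False)
only_leaders = mms_to_rebind(tasks, "/target", leaders_only=True)
print(sorted(all_threads - only_leaders))  # mms a leaders-only walk would skip
```

Here the leaders-only walk skips mm_100, because its leader is attached to
a different cpuset.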

Thanks.

-- 
tejun
