Message-ID: <dd1418f9-93d0-45c9-bcc2-d67f48d050f6@huaweicloud.com>
Date: Fri, 15 Aug 2025 15:29:56 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Hillf Danton <hdanton@...a.com>, Michal Koutný
 <mkoutny@...e.com>
Cc: tj@...nel.org, cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
 lujialin4@...wei.com, chenridong@...wei.com, gaoyingjie@...ontech.com
Subject: Re: [PATCH v2 -next] cgroup: remove offline draining in root
 destruction to avoid hung_tasks



On 2025/8/15 10:40, Hillf Danton wrote:
> On Fri, Jul 25, 2025 at 09:42:05AM +0800, Chen Ridong <chenridong@...weicloud.com> wrote:
>>> On Tue, Jul 22, 2025 at 11:27:33AM +0000, Chen Ridong <chenridong@...weicloud.com> wrote:
>>>> CPU0                            CPU1
>>>> mount perf_event                umount net_prio
>>>> cgroup1_get_tree                cgroup_kill_sb
>>>> rebind_subsystems               // root destruction enqueues
>>>> 				// cgroup_destroy_wq
>>>> // kill all perf_event css
>>>>                                 // one perf_event css A is dying
>>>>                                 // css A offline enqueues cgroup_destroy_wq
>>>>                                 // root destruction will be executed first
>>>>                                 css_free_rwork_fn
>>>>                                 cgroup_destroy_root
>>>>                                 cgroup_lock_and_drain_offline
>>>>                                 // some perf descendants are dying
>>>>                                 // cgroup_destroy_wq max_active = 1
>>>>                                 // waiting for css A to die
>>>>
>>>> Problem scenario:
>>>> 1. CPU0 mounts perf_event (rebind_subsystems)
>>>> 2. CPU1 unmounts net_prio (cgroup_kill_sb), queuing root destruction work
>>>> 3. A dying perf_event CSS gets queued for offline after root destruction
>>>> 4. Root destruction waits for offline completion, but offline work is
>>>>    blocked behind root destruction in cgroup_destroy_wq (max_active=1)
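
To make the ordering problem concrete, here is a minimal userspace model
(illustrative only; none of the names below are kernel code): one worker
thread drains a strict FIFO, i.e. max_active == 1, and the first item
waits on a flag that only the second item can set.

/*
 * Work item A ("root destruction") waits for a flag that only work
 * item B ("css offline") sets -- but B sits behind A in the queue,
 * so A waits forever.  A 5s timeout is used so the demo terminates.
 */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int css_offlined;		/* set by work item B */

static void root_destruction(void)	/* work item A, queued first */
{
	struct timespec ts;

	clock_gettime(CLOCK_REALTIME, &ts);
	ts.tv_sec += 5;

	pthread_mutex_lock(&lock);
	while (!css_offlined) {
		/* returns ETIMEDOUT once the 5s deadline passes */
		if (pthread_cond_timedwait(&cond, &lock, &ts)) {
			puts("hung: offline work is queued behind us");
			break;
		}
	}
	pthread_mutex_unlock(&lock);
}

static void css_offline(void)		/* work item B, queued second */
{
	pthread_mutex_lock(&lock);
	css_offlined = 1;
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
}

static void *worker(void *arg)
{
	/* strict FIFO with one concurrent item: A must finish before B runs */
	root_destruction();
	css_offline();
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, worker, NULL);
	pthread_join(t, NULL);
	return 0;
}

Compiled with "cc -o hang hang.c -lpthread", it reports the hang after
the five-second timeout.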
>>>
>>> What's concerning me is why umount of the net_prio hierarchy waits for
>>> draining of the default hierarchy? (Where you then run into conflict with
>>> perf_event, which is implicit_on_dfl.)
>>>
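
If I read the code correctly, the drain is unconditional:
cgroup_destroy_root() always drains the default hierarchy, whichever
root is being torn down, so the net_prio umount ends up waiting on the
dying perf_event csses.  Paraphrased from kernel/cgroup/cgroup.c:

	static void cgroup_destroy_root(struct cgroup_root *root)
	{
		...
		cgroup_lock_and_drain_offline(&cgrp_dfl_root.cgrp);
		...
	}
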
> /*
>  * cgroup destruction makes heavy use of work items and there can be a lot
>  * of concurrent destructions.  Use a separate workqueue so that cgroup
>  * destruction work items don't end up filling up max_active of system_wq
>  * which may lead to deadlock.
>  */
> 
> If the task hang can be reliably reproduced, it is the right time to
> drop the max_active limit on cgroup_destroy_wq, in line with its comment.

Hi Danton,

Thank you for your feedback.

While modifying max_active could be a viable solution, I’m unsure whether it might introduce other
side effects. Instead, I’ve proposed an alternative approach in v3 of the patch, which I believe
addresses the issue more comprehensively.
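
For concreteness, my understanding of that change (a sketch only; I have
not evaluated its side effects) is roughly:

	/* current allocation in kernel/cgroup/cgroup.c */
	cgroup_destroy_wq = alloc_workqueue("cgroup_destroy", 0, 1);

	/*
	 * lifting the limit: max_active of 0 selects the default
	 * (WQ_DFL_ACTIVE), so offline work would no longer be
	 * serialized behind root destruction
	 */
	cgroup_destroy_wq = alloc_workqueue("cgroup_destroy", 0, 0);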

I’d be very grateful if you could take a look and share your thoughts. Your review would be greatly
appreciated!

v3: https://lore.kernel.org/cgroups/20250815070518.1255842-1-chenridong@huaweicloud.com/T/#u

-- 
Best regards,
Ridong

