Message-ID: <83cea8c6-d2f8-42f2-990e-80412ebf296e@huaweicloud.com>
Date: Thu, 12 Sep 2024 09:33:23 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Tejun Heo <tj@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>
Cc: Michal Koutný <mkoutny@...e.com>,
Chen Ridong <chenridong@...wei.com>, martin.lau@...ux.dev, ast@...nel.org,
daniel@...earbox.net, andrii@...nel.org, eddyz87@...il.com, song@...nel.org,
yonghong.song@...ux.dev, john.fastabend@...il.com, kpsingh@...nel.org,
sdf@...gle.com, haoluo@...gle.com, jolsa@...nel.org,
lizefan.x@...edance.com, hannes@...xchg.org, bpf@...r.kernel.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/1] cgroup: fix deadlock caused by cgroup_mutex and cpu_hotplug_lock

On 2024/9/11 5:17, Tejun Heo wrote:
> On Tue, Sep 10, 2024 at 09:02:59PM +0000, Roman Gushchin wrote:
> ...
>>>> By that reasoning, any holder of cgroup_mutex on system_wq makes the system
>>>> susceptible to a deadlock (in the presence of cpu_hotplug_lock waiting
>>>> writers + cpuset operations). And the two work items must meet in the same
>>>> worker's processing, hence the probability is low (zero?) with fewer than
>>>> WQ_DFL_ACTIVE items.
>>
>> Right, I'm on the same page. Should we document then somewhere that
>> the cgroup mutex can't be locked from a system wq context?
>>
>> I think this will also make the Fixes tag more meaningful.
>
> I think that's completely fine. What's not fine is saturating system_wq.
> Anything which creates a large number of concurrent work items should be
> using its own workqueue. If anything, workqueue needs to add a warning for
> saturation conditions and report who the offenders are.
>
> Thanks.
>
I will add a patch to document that.
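
To illustrate the dedicated-workqueue suggestion, here is a minimal
sketch (example_wq, example_workfn and example_init are hypothetical
names, not code from this patch):

#include <linux/init.h>
#include <linux/errno.h>
#include <linux/workqueue.h>

/*
 * The scenario discussed above: a work item that takes cgroup_mutex
 * can, together with a cpuset operation holding cgroup_mutex and a
 * waiting cpu_hotplug_lock writer, wedge a saturated system_wq pool.
 * Queueing such items on a dedicated workqueue keeps them from
 * consuming system_wq's shared max_active slots.
 */
static struct workqueue_struct *example_wq;

static void example_workfn(struct work_struct *work)
{
	/* May sleep on cgroup_mutex; runs on the dedicated queue. */
}

static DECLARE_WORK(example_work, example_workfn);

static int __init example_init(void)
{
	/* max_active == 0 selects the default, WQ_DFL_ACTIVE. */
	example_wq = alloc_workqueue("example_wq", 0, 0);
	if (!example_wq)
		return -ENOMEM;

	/* queue_work() on the private queue, instead of schedule_work(),
	 * which would put the item on system_wq. */
	queue_work(example_wq, &example_work);
	return 0;
}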
Also, should we raise WQ_DFL_ACTIVE (currently 256)? Maybe 1024 would
be acceptable?
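
For reference, the default is derived from the per-queue hard cap; the
following is a paraphrase of the relevant constants, not a verbatim
copy of include/linux/workqueue.h:

#define WQ_MAX_ACTIVE	512			/* per-workqueue hard cap */
#define WQ_DFL_ACTIVE	(WQ_MAX_ACTIVE / 2)	/* 256; used when alloc_workqueue()
						 * is passed max_active == 0 */

Since the default is tied to the cap, an alternative to raising it
globally would be for heavy users to request a larger max_active on
their own queues ("heavy_wq" is a hypothetical name):

	struct workqueue_struct *wq = alloc_workqueue("heavy_wq", 0, WQ_MAX_ACTIVE);
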
Best regards,
Ridong