[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c5028595-7e76-60f2-e744-c2815f917ee4@huawei.com>
Date: Mon, 17 Jan 2022 14:27:28 +0800
From: Zhang Qiao <zhangqiao22@...wei.com>
To: Waiman Long <longman@...hat.com>
CC: <lizefan.x@...edance.com>, <hannes@...xchg.org>,
<cgroups@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
Michal Koutný <mkoutny@...e.com>,
Tejun Heo <tj@...nel.org>
Subject: Re: [Question] set_cpus_allowed_ptr() call failed at cpuset_attach()
在 2022/1/17 12:35, Waiman Long 写道:
> On 1/16/22 21:25, Zhang Qiao wrote:
>> hello
>>
>> 在 2022/1/15 4:33, Waiman Long 写道:
>>> On 1/14/22 11:20, Tejun Heo wrote:
>>>> (cc'ing Waiman and Michal and quoting whole body)
>>>>
>>>> Seems sane to me but let's hear what Waiman and Michal think.
>>>>
>>>> On Fri, Jan 14, 2022 at 09:15:06AM +0800, Zhang Qiao wrote:
>>>>> Hello everyone
>>>>>
>>>>> I found the following warning log on qemu. I migrated a task from one cpuset cgroup to
>>>>> another, while I also performed the cpu hotplug operation, and got following calltrace.
>>>>>
>>>>> This may lead to a inconsistency between the affinity of the task and cpuset.cpus of the
>>>>> dest cpuset, but this task can be successfully migrated to the dest cpuset cgroup.
>>>>>
>>>>> Can we use cpus_read_lock()/cpus_read_unlock() to guarantee that set_cpus_allowed_ptr()
>>>>> doesn't fail, as follows:
>>>>>
>>>>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>>>>> index d0e163a02099..2535d23d2c51 100644
>>>>> --- a/kernel/cgroup/cpuset.c
>>>>> +++ b/kernel/cgroup/cpuset.c
>>>>> @@ -2265,6 +2265,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
>>>>> guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
>>>>>
>>>>> cgroup_taskset_for_each(task, css, tset) {
>>>>> + cpus_read_lock();
>>>>> if (cs != &top_cpuset)
>>>>> guarantee_online_cpus(task, cpus_attach);
>>>>> else
>>>>> @@ -2274,6 +2275,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
>>>>> * fail. TODO: have a better way to handle failure here
>>>>> */
>>>>> WARN_ON_ONCE(set_cpus_allowed_ptr(task, cpus_attach));
>>>>> + cpus_read_unlock();
>>>>>
>>>>>
>>>>> Is there a better solution?
>>>>>
>>>>> Thanks
>>> The change looks OK to me. However, we may need to run the full set of regression test to make sure that lockdep won't complain about potential deadlock.
>>>
>> I run the test with lockdep enabled, and got lockdep warning like that below.
>> so we should take the cpu_hotplug_lock first, then take the cpuset_rwsem lock.
>>
>> thanks,
>> Zhang Qiao
>>
>> [ 38.420372] ======================================================
>> [ 38.421339] WARNING: possible circular locking dependency detected
>> [ 38.422312] 5.16.0-rc4+ #13 Not tainted
>> [ 38.422920] ------------------------------------------------------
>> [ 38.423883] bash/594 is trying to acquire lock:
>> [ 38.424595] ffffffff8286afc0 (cpu_hotplug_lock){++++}-{0:0}, at: cpuset_attach+0xc2/0x1e0
>> [ 38.425880]
>> [ 38.425880] but task is already holding lock:
>> [ 38.426787] ffffffff8296a5a0 (&cpuset_rwsem){++++}-{0:0}, at: cpuset_attach+0x3e/0x1e0
>> [ 38.428015]
>> [ 38.428015] which lock already depends on the new lock.
>> [ 38.428015]
>> [ 38.429279]
>> [ 38.429279] the existing dependency chain (in reverse order) is:
>> [ 38.430445]
>> [ 38.430445] -> #1 (&cpuset_rwsem){++++}-{0:0}:
>> [ 38.431371] percpu_down_write+0x42/0x130
>> [ 38.432085] cpuset_css_online+0x2b/0x2e0
>> [ 38.432808] online_css+0x24/0x80
>> [ 38.433411] cgroup_apply_control_enable+0x2fa/0x330
>> [ 38.434273] cgroup_mkdir+0x396/0x4c0
>> [ 38.434930] kernfs_iop_mkdir+0x56/0x80
>> [ 38.435614] vfs_mkdir+0xde/0x190
>> [ 38.436220] do_mkdirat+0x7d/0xf0
>> [ 38.436824] __x64_sys_mkdir+0x21/0x30
>> [ 38.437495] do_syscall_64+0x3a/0x80
>> [ 38.438145] entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [ 38.439015]
>> [ 38.439015] -> #0 (cpu_hotplug_lock){++++}-{0:0}:
>> [ 38.439980] __lock_acquire+0x17f6/0x2260
>> [ 38.440691] lock_acquire+0x277/0x320
>> [ 38.441347] cpus_read_lock+0x37/0xc0
>> [ 38.442011] cpuset_attach+0xc2/0x1e0
>> [ 38.442671] cgroup_migrate_execute+0x3a6/0x490
>> [ 38.443461] cgroup_attach_task+0x22c/0x3d0
>> [ 38.444197] __cgroup1_procs_write.constprop.21+0x10d/0x170
>> [ 38.445145] cgroup_file_write+0x6f/0x230
>> [ 38.445860] kernfs_fop_write_iter+0x130/0x1b0
>> [ 38.446636] new_sync_write+0x120/0x1b0
>> [ 38.447319] vfs_write+0x359/0x3b0
>> [ 38.447937] ksys_write+0xa2/0xe0
>> [ 38.448540] do_syscall_64+0x3a/0x80
>> [ 38.449183] entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [ 38.450057]
>> [ 38.450057] other info that might help us debug this:
>> [ 38.450057]
>> [ 38.451297] Possible unsafe locking scenario:
>> [ 38.451297]
>> [ 38.452218] CPU0 CPU1
>> [ 38.452935] ---- ----
>> [ 38.453650] lock(&cpuset_rwsem);
>> [ 38.454188] lock(cpu_hotplug_lock);
>> [ 38.455148] lock(&cpuset_rwsem);
>> [ 38.456069] lock(cpu_hotplug_lock);
>
> Yes, you need to play around with lock ordering to make sure that lockdep won't complain.
>
Thank you for taking a look!
if ok, i will send a patch.
Thanks,
Zhang Qiao.
> Cheers,
> Longman
>
> .
Powered by blists - more mailing lists