Message-ID: <20220119130221.GA31037@blackbody.suse.cz>
Date:   Wed, 19 Jan 2022 14:02:22 +0100
From:   Michal Koutný <mkoutny@...e.com>
To:     Zhang Qiao <zhangqiao22@...wei.com>
Cc:     Tejun Heo <tj@...nel.org>, lizefan.x@...edance.com,
        hannes@...xchg.org, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [Question] set_cpus_allowed_ptr() call failed at cpuset_attach()

On Fri, Jan 14, 2022 at 09:15:06AM +0800, Zhang Qiao <zhangqiao22@...wei.com> wrote:
> 	I found the following warning log on qemu. I migrated a task from one cpuset cgroup to
> another, while I also performed the cpu hotplug operation, and got following calltrace.

Do you have more information on which hotplug event and which error
(from set_cpus_allowed_ptr()) you observe? (And what are the src/dst
cpusets wrt root/non-root?)

> 	Can we use cpus_read_lock()/cpus_read_unlock() to guarantee that set_cpus_allowed_ptr()
> doesn't fail, as follows:

I'm wondering what can be wrong with the current actors:

    cpuset_can_attach
      down_read(cpuset_rwsem)
        // check all migratees
      up_read(cpuset_rwsem)
                                      [ _cpu_down / cpuhp_setup_state ]
                                      schedule_work
                                      ...
                                      cpuset_hotplug_update_tasks
                                        down_write(cpuset_rwsem)
                                        up_write(cpuset_rwsem)
                                      ... flush_work
                                      [ _cpu_down / cpu_up_down_serialize_trainwrecks ]
    cpuset_attach
      down_write(cpuset_rwsem)
        set_cpus_allowed_ptr(allowed_cpus_weird)
      up_write(cpuset_rwsem)

The statement in cpuset_attach() about the cpuset_can_attach() test is
not that strong, since task_can_attach() is mostly a pass for
non-deadline tasks. Still, the use of cpuset_rwsem above should
synchronize (I may be mistaken) the changes of the cpuset's cpu masks,
so I'd be interested in the details asked for above to understand why
the current approach doesn't work.
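
To make the window in the timeline above concrete, here is a minimal
single-threaded userspace sketch, not kernel code. Everything named
model_* is hypothetical, the bitmask "cpumasks" and the error codes are
illustrative only, and the hotplug step is deliberately simplified (it
lets the destination mask become empty, which the real cpuset code may
handle differently). It only shows that the check and the use sit in
separate critical sections, so a writer can run in between:

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for cpuset_rwsem: 0 = free, >0 = readers, -1 = writer. */
static int model_rwsem;
static void model_down_read(void)  { assert(model_rwsem >= 0);  model_rwsem++; }
static void model_up_read(void)    { assert(model_rwsem > 0);   model_rwsem--; }
static void model_down_write(void) { assert(model_rwsem == 0);  model_rwsem = -1; }
static void model_up_write(void)   { assert(model_rwsem == -1); model_rwsem = 0; }

/* Assumed starting state: CPUs 0-1 are up, destination cpuset has CPU 1. */
static unsigned long model_active_cpus = 0x3;
static unsigned long model_dst_cpus = 0x2;

/* Check phase, under the read lock (compare cpuset_can_attach()). */
static int model_can_attach(void)
{
	int ret;

	model_down_read();
	ret = (model_dst_cpus & model_active_cpus) ? 0 : -ENOSPC;
	model_up_read();
	return ret;
}

/* Hotplug phase: masks are rewritten under the write lock. */
static void model_hotplug_offline(int cpu)
{
	assert(cpu >= 0 && cpu < 2);
	model_down_write();
	model_active_cpus &= ~(1UL << cpu);
	model_dst_cpus &= model_active_cpus; /* simplification, see above */
	model_up_write();
}

/* Use phase, under the write lock (compare cpuset_attach()). */
static int model_attach(void)
{
	int ret;

	model_down_write();
	/* Stand-in for set_cpus_allowed_ptr() rejecting an unusable mask. */
	ret = (model_dst_cpus & model_active_cpus) ? 0 : -EINVAL;
	model_up_write();
	return ret;
}

/* One migration; optionally let a hotplug event run inside the window. */
static int model_migration(int hotplug_in_window)
{
	int ret = model_can_attach();

	if (ret)
		return ret;
	if (hotplug_in_window)
		model_hotplug_offline(1);
	return model_attach();
}
```

In this model, model_migration(0) succeeds, while a following
model_migration(1) passes the check and then fails in the attach step.
Whether the real failure follows this pattern is exactly what the
requested details would need to show.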

The additional cpus_read_{,un}lock (when reordered wrt cpuset_rwsem)
may work but your patch should explain why (in what situation).

My .02€,
Michal
