Date:   Tue, 12 Jul 2022 18:14:07 +0100
From:   Qais Yousef <qais.yousef@....com>
To:     Tejun Heo <tj@...nel.org>
Cc:     Xuewen Yan <xuewen.yan@...soc.com>, rafael@...nel.org,
        viresh.kumar@...aro.org, mingo@...hat.com, peterz@...radead.org,
        juri.lelli@...hat.com, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, bristot@...hat.com, linux-kernel@...r.kernel.org,
        ke.wang@...soc.com, xuewyan@...mail.com, linux-pm@...r.kernel.org,
        Waiman Long <longman@...hat.com>
Subject: Re: [PATCH] sched/schedutil: Fix deadlock between cpuset and cpu
 hotplug when using schedutil

On 07/12/22 06:13, Tejun Heo wrote:
> On Tue, Jul 12, 2022 at 01:57:02PM +0100, Qais Yousef wrote:
> > Are there a lot of subsystems besides cpuset that need the cpus_read_lock()?
> > A quick grep tells me it's the only one.
> > 
> > Can't we instead use cpus_read_trylock() in cpuset_can_attach() so that we
> > either take the lock successfully before we go ahead and call
> > cpuset_attach(), or bail out and cancel the whole attach operation, which should
> > unlock the threadgroup_rwsem lock?
> 
> But now we're failing user-initiated operations randomly. I have a hard time

True. That might appear more random than necessary. It looked neat and
I thought since hotplug operations aren't that common and users must be
prepared for failures for other reasons, it might be okay.

> seeing that as an acceptable solution. The only thing we can do, I think, is
> establishing a locking order between the two locks by either nesting

That might be enough, provided no other path can take them in the reverse
order. It would be more robust to either take both locks together or wait
until we can. Then this path, at least, can no longer cause ordering
problems.

> threadgroup_rwsem under cpus_read_lock or disallowing thread creation during
> hotplug operations.

I think that's what Xuewen tried to do in the proposed patch. But it only
fixes it for one specific user. If we go with that, we'll need some checks in
place to warn when other users do the same.


Thanks

--
Qais Yousef
