Message-ID: <4fee4fydxuxzee5cb5ehiil7g7bnhxp5cmxxgg3zszc4vx4qyc@6t2qmltutcrh>
Date: Thu, 26 Sep 2024 14:53:46 +0200
From: Michal Koutný <mkoutny@...e.com>
To: Hillf Danton <hdanton@...a.com>
Cc: Chen Ridong <chenridong@...wei.com>, tj@...nel.org, 
	cgroups@...r.kernel.org, Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>, 
	Boqun Feng <boqun.feng@...il.com>, Linus Torvalds <torvalds@...ux-foundation.org>, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/1] cgroup: fix deadlock caused by cgroup_mutex and
 cpu_hotplug_lock

Hello Hillf.

(Sorry for the late reply.)

On Wed, Sep 11, 2024 at 07:15:42PM GMT, Hillf Danton <hdanton@...a.com> wrote:
> > However, there is no ordering between (I) and (II) so they can also happen
> > in opposite
> > 
> > 	thread T					system_wq worker
> > 
> > 	down(cpu_hotplug_lock.read)
> > 	smp_call_on_cpu
> > 	  queue_work_on(cpu, system_wq, scss) (I)
> > 	  						lock(cgroup_mutex)  (II)
> > 							...
> > 							unlock(cgroup_mutex)
> > 							scss.func
> > 	  wait_for_completion(scss)
> > 	up(cpu_hotplug_lock.read)
> > 
> > And here the thread T + system_wq worker effectively call
> > cpu_hotplug_lock and cgroup_mutex in the wrong order. (And since they're
> > two threads, it won't be caught by lockdep.)
> > 
> Given no workqueue work executed without being dequeued, any queued work,
> regardless if they are more than 2048, that acquires cgroup_mutex could not
> prevent the work queued by thread-T from being executed, so thread-T can
> make safe forward progress, therefore with no chance left for the ABBA 
> deadlock you spotted where lockdep fails to work.

Is there a forgotten negation, i.e. did you intend to write: "any queued
work ... that acquired cgroup_mutex could prevent"?

Or, if the negation is correct, why do you think that the processed work
item is _not_ preventing thread T from running (in the case I left quoted
above)?
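
To make the question concrete, here is a toy userspace model of the quoted
case. It is a sketch in Python, not kernel code, and it rests on stated
assumptions: a single worker thread stands in for a workqueue whose workers
are all occupied, `needs_cgroup_mutex` is a hypothetical work item that takes
the mutex, and the main thread plays both the holder of cgroup_mutex and
thread T waiting for its own work with a timeout instead of an unbounded
wait_for_completion().

```python
import queue
import threading


def simulate(timeout=0.5):
    """Return (completed_in_time, completed_after_mutex_release)."""
    cgroup_mutex = threading.Lock()   # stand-in for cgroup_mutex
    work_q = queue.Queue()            # stand-in for the (saturated) workqueue
    scss_done = threading.Event()     # stand-in for the scss completion

    def worker():
        # Process work items strictly in FIFO order; None means "stop".
        while True:
            func = work_q.get()
            if func is None:
                return
            func()

    def needs_cgroup_mutex():
        # Hypothetical work item that has to take cgroup_mutex.
        with cgroup_mutex:
            pass

    # Some other context already holds cgroup_mutex.
    cgroup_mutex.acquire()

    t = threading.Thread(target=worker)
    t.start()

    work_q.put(needs_cgroup_mutex)    # dequeued first, then blocks on the mutex
    work_q.put(scss_done.set)         # thread T's work, stuck behind it

    # Thread T waits for its work; a timeout replaces the unbounded wait.
    completed_in_time = scss_done.wait(timeout)

    # Break the dependency so the model can terminate.
    cgroup_mutex.release()
    work_q.put(None)
    t.join()
    return completed_in_time, scss_done.is_set()


if __name__ == "__main__":
    print(simulate())
```

In this model the work queued by thread T does not run until the mutex
holder lets go, which is the dependency I am asking about. Whether the real
workqueue behaves like the single worker here depends on how many workers
are available, so treat it as an illustration only.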

Thanks,
Michal
