Message-ID: <Pine.LNX.4.44L0.0908121216500.9635-100000@iolanthe.rowland.org>
Date:	Wed, 12 Aug 2009 12:23:35 -0400 (EDT)
From:	Alan Stern <stern@...land.harvard.edu>
To:	James Bottomley <James.Bottomley@...senPartnership.com>
cc:	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Add kerneldoc for flush_scheduled_work()

On Wed, 12 Aug 2009, James Bottomley wrote:

> This all boils down to "the race window for a deadlock may be narrower".
> 
> Instead of training programmers to narrow deadlock races, we should be
> training them to avoid them.
> 
> The entangled deadlock problem occurs in all of our _sync() APIs as well
> as interrupt and other workqueue stuff.
> 
> The rules are something like
> 
> Never use synchronous operations if you can avoid them.  If you must use
> operations that wait for another thread to complete (say because you're
> about to destroy data structures that queued elements may be using):
>      1. Never hold any locks or mutexes while waiting for the completion
>         of synchronous operations
>      2. If you have to hold a lock while waiting:
>              1. If it's a global lock, make sure you're using a local
>                 queue and that nothing you submitted to the queue can
>                 take the lock
>              2. If it's a local lock, you may use a global queue but
>                 must still make sure that nothing you submitted to the
>                 queue can take the lock.

I haven't seen these rules written down anywhere in the kernel 
documentation.  Presumably people are supposed to be aware of them 
already?

Anyway, 2.1 and 2.2 are wrong.  They should read: "... make sure that
nothing submitted to the queue calls any of your routines that can take
the lock."  The point being that even though _you_ don't submit
anything bad to the queue, somebody else might do so.

If you use cancel_work_sync() instead of flush_scheduled_work() then 
the rules become less onerous.  In place of your 2.1 and 2.2, we have:

	Make sure the work item you are cancelling cannot take
	the lock.

This is much easier to verify.
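
To illustrate the difference, here is a rough userspace model of the two
checks. It is only a sketch, not kernel code: struct model_work, its
fields, and both helper functions are hypothetical names invented for this
example, and "the lock" means whatever lock the caller holds while waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a queued work item; not the kernel's work_struct. */
struct model_work {
	const char *owner;	/* who submitted the item */
	bool may_take_lock;	/* can its handler end up taking the lock? */
};

/*
 * Models waiting for the whole queue while holding the lock, as with
 * flush_scheduled_work(): the wait can deadlock if ANY pending item may
 * take the lock, no matter who submitted it.
 */
bool flush_can_deadlock(const struct model_work *queue, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (queue[i].may_take_lock)
			return true;
	return false;
}

/*
 * Models waiting for a single item while holding the lock, as with
 * cancel_work_sync(): only that one item has to be checked.
 */
bool cancel_can_deadlock(const struct model_work *item)
{
	return item->may_take_lock;
}
```

With a queue holding your own harmless item plus somebody else's item
that calls back into a routine taking the lock, flush_can_deadlock()
reports a problem while cancel_can_deadlock() on your own item does not.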

Alan Stern

