Date:   Fri, 13 Oct 2017 16:42:28 +0100
From:   Tvrtko Ursulin <tursulin@...ulin.net>
To:     Intel-gfx@...ts.freedesktop.org
Cc:     tursulin@...ulin.net, Tvrtko Ursulin <tvrtko.ursulin@...el.com>,
        Daniel Vetter <daniel.vetter@...ll.ch>,
        Chris Wilson <chris@...is-wilson.co.uk>,
        linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>, boqun.feng@...il.com,
        byungchul.park@....com, david@...morbit.com,
        johannes@...solutions.net, oleg@...hat.com
Subject: [RFC v2] locking/lockdep: Don't bunch up all flush_workqueue callers into a single completion cross-release context

From: Tvrtko Ursulin <tvrtko.ursulin@...el.com>

It looks like all completions created by flush_workqueue map into
a single lock class, which creates lockdep false positives.

Example of a false positive:

 [   20.805315] ======================================================
 [   20.805316] WARNING: possible circular locking dependency detected
 [   20.805319] 4.14.0-rc4-CI-CI_DRM_3225+ #1 Tainted: G     U
 [   20.805320] ------------------------------------------------------
 [   20.805322] kworker/6:1H/1438 is trying to acquire lock:
 [   20.805324]  (&mm->mmap_sem){++++}, at: [<ffffffffa01c8e01>] __i915_gem_userptr_get_pages_worker+0x141/0x240 [i915]
 [   20.805355] but now in release context of a crosslock acquired at the following:
 [   20.805357]  ((complete)&this_flusher.done){+.+.}, at: [<ffffffff8190b06d>] wait_for_completion+0x1d/0x20
 [   20.805363] which lock already depends on the new lock.
 [   20.805365]  the existing dependency chain (in reverse order) is:
 [   20.805367] -> #1 ((complete)&this_flusher.done){+.+.}:
 [   20.805372]        __lock_acquire+0x1420/0x15e0
 [   20.805374]        lock_acquire+0xb0/0x200
 [   20.805376]        wait_for_common+0x58/0x210
 [   20.805378]        wait_for_completion+0x1d/0x20
 [   20.805381]        flush_workqueue+0x1af/0x540
 [   20.805400]        i915_gem_userptr_mn_invalidate_range_start+0x13c/0x150 [i915]
 [   20.805404]        __mmu_notifier_invalidate_range_start+0x76/0xc0
 [   20.805406]        unmap_vmas+0x7d/0xa0
 [   20.805408]        unmap_region+0xae/0x110
 [   20.805410]        do_munmap+0x276/0x3f0
 [   20.805411]        vm_munmap+0x67/0x90
 [   20.805413]        SyS_munmap+0xe/0x20
 [   20.805415]        entry_SYSCALL_64_fastpath+0x1c/0xb1
 [   20.805416] -> #0 (&mm->mmap_sem){++++}:
 [   20.805419]        down_read+0x3e/0x70
 [   20.805435]        __i915_gem_userptr_get_pages_worker+0x141/0x240 [i915]
 [   20.805438]        process_one_work+0x233/0x660
 [   20.805440]        worker_thread+0x4e/0x3b0
 [   20.805441]        kthread+0x152/0x190
 [   20.805442] other info that might help us debug this:
 [   20.805445]  Possible unsafe locking scenario by crosslock:
 [   20.805447]        CPU0                    CPU1
 [   20.805448]        ----                    ----
 [   20.805449]   lock(&mm->mmap_sem);
 [   20.805451]   lock((complete)&this_flusher.done);
 [   20.805453]                                lock(&mm->mmap_sem);
 [   20.805455]                                unlock((complete)&this_flusher.done);
 [   20.805457] *** DEADLOCK ***
 [   20.805460] 2 locks held by kworker/6:1H/1438:
 [   20.805461]  #0:  (&(&pool->lock)->rlock){-.-.}, at: [<ffffffff8109c94c>] process_one_work+0x2dc/0x660
 [   20.805465]  #1:  (&x->wait#10){....}, at: [<ffffffff810cd69d>] complete+0x1d/0x60
 [   20.805469] stack backtrace:
 [   20.805472] CPU: 6 PID: 1438 Comm: kworker/6:1H Tainted: G     U          4.14.0-rc4-CI-CI_DRM_3225+ #1
 [   20.805474] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
 [   20.805480] Call Trace:
 [   20.805483]  dump_stack+0x68/0x9f
 [   20.805486]  print_circular_bug+0x235/0x3c0
 [   20.805488]  ? HARDIRQ_verbose+0x10/0x10
 [   20.805490]  check_prev_add+0x430/0x840
 [   20.805492]  ? ret_from_fork+0x27/0x40
 [   20.805494]  lock_commit_crosslock+0x3ee/0x660
 [   20.805496]  ? lock_commit_crosslock+0x3ee/0x660
 [   20.805498]  complete+0x29/0x60
 [   20.805500]  pwq_dec_nr_in_flight+0x9c/0xa0
 [   20.805502]  ? _raw_spin_lock_irq+0x40/0x50
 [   20.805504]  process_one_work+0x335/0x660
 [   20.805506]  worker_thread+0x4e/0x3b0
 [   20.805508]  kthread+0x152/0x190
 [   20.805509]  ? process_one_work+0x660/0x660
 [   20.805511]  ? kthread_create_on_node+0x40/0x40
 [   20.805513]  ret_from_fork+0x27/0x40

The solution here is to create a lockdep map for completions
together with the workqueue object, and to allow completions to be
initialized with a pointer to an external lockdep map. This map is
then used instead of the built-in lockdep_map structure.

This way the completion cross-release context is tied to its
parent workqueue and should still catch all issues without being
susceptible to false positives.
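
For illustration only (not part of this patch), a hypothetical user
outside the workqueue code could follow the same pattern: embed a
cross-release map in a long-lived object and tie short-lived,
on-stack completions to it, so they all share that object's lock
class instead of the generic one. The foo_engine names below are
made up; only interfaces already used or added by this patch
(lockdep_init_map_crosslock, init_completion_map) are assumed.

	struct foo_engine {
	#ifdef CONFIG_LOCKDEP_COMPLETIONS
		struct lockdep_map_cross completion_lockdep_map;
	#endif
		/* ... */
	};

	static struct lock_class_key foo_completion_key;

	static void foo_engine_init(struct foo_engine *engine)
	{
	#ifdef CONFIG_LOCKDEP_COMPLETIONS
		/* One lockdep class for all completions of this engine. */
		lockdep_init_map_crosslock((struct lockdep_map *)&engine->completion_lockdep_map,
					   "foo_engine_completion",
					   &foo_completion_key, 0);
	#endif
	}

	static void foo_wait_idle(struct foo_engine *engine)
	{
		struct completion done;

		/* Tied to the engine's map, not a per-call-site class. */
		init_completion_map(&done, &engine->completion_lockdep_map);
		/* ... kick off work that will call complete(&done) ... */
		wait_for_completion(&done);
	}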

v2: Simplify by always having a valid map pointer. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@...el.com>
Cc: Daniel Vetter <daniel.vetter@...ll.ch>
Cc: Chris Wilson <chris@...is-wilson.co.uk>
Cc: linux-kernel@...r.kernel.org
Cc: Tejun Heo <tj@...nel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: boqun.feng@...il.com
Cc: byungchul.park@....com
Cc: david@...morbit.com
Cc: johannes@...solutions.net
Cc: oleg@...hat.com
--
Borrowed the Cc list from a recent fix in the same area.
Apologies if it is too wide.
---
 include/linux/completion.h | 46 +++++++++++++++++++++++++++++-----------------
 include/linux/workqueue.h  | 26 ++++++++++++++++++++++----
 kernel/workqueue.c         | 20 +++++++++++++-------
 3 files changed, 64 insertions(+), 28 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index cae5400022a3..de4ee32ede0f 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -29,24 +29,41 @@ struct completion {
 	unsigned int done;
 	wait_queue_head_t wait;
 #ifdef CONFIG_LOCKDEP_COMPLETIONS
+	struct lockdep_map_cross *pmap;
 	struct lockdep_map_cross map;
 #endif
 };
 
+/**
+ * init_completion - Initialize a dynamically allocated completion
+ * @x:  pointer to completion structure that is to be initialized
+ *
+ * This inline function will initialize a dynamically created completion
+ * structure.
+ */
+static inline void __init_completion(struct completion *x)
+{
+	x->done = 0;
+#ifdef CONFIG_LOCKDEP_COMPLETIONS
+	x->pmap = &x->map;
+#endif
+	init_waitqueue_head(&x->wait);
+}
+
 #ifdef CONFIG_LOCKDEP_COMPLETIONS
 static inline void complete_acquire(struct completion *x)
 {
-	lock_acquire_exclusive((struct lockdep_map *)&x->map, 0, 0, NULL, _RET_IP_);
+	lock_acquire_exclusive((struct lockdep_map *)x->pmap, 0, 0, NULL, _RET_IP_);
 }
 
 static inline void complete_release(struct completion *x)
 {
-	lock_release((struct lockdep_map *)&x->map, 0, _RET_IP_);
+	lock_release((struct lockdep_map *)x->pmap, 0, _RET_IP_);
 }
 
 static inline void complete_release_commit(struct completion *x)
 {
-	lock_commit_crosslock((struct lockdep_map *)&x->map);
+	lock_commit_crosslock((struct lockdep_map *)x->pmap);
 }
 
 #define init_completion(x)						\
@@ -57,8 +74,16 @@ do {									\
 			&__key, 0);					\
 	__init_completion(x);						\
 } while (0)
+
+static inline void init_completion_map(struct completion *x,
+				       struct lockdep_map_cross *map)
+{
+	__init_completion(x);
+	x->pmap = map;
+}
 #else
 #define init_completion(x) __init_completion(x)
+#define init_completion_map(x, m) __init_completion(x)
 static inline void complete_acquire(struct completion *x) {}
 static inline void complete_release(struct completion *x) {}
 static inline void complete_release_commit(struct completion *x) {}
@@ -66,7 +91,7 @@ static inline void complete_release_commit(struct completion *x) {}
 
 #ifdef CONFIG_LOCKDEP_COMPLETIONS
 #define COMPLETION_INITIALIZER(work) \
-	{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
+	{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait), &(work).map, \
 	STATIC_CROSS_LOCKDEP_MAP_INIT("(complete)" #work, &(work)) }
 #else
 #define COMPLETION_INITIALIZER(work) \
@@ -107,19 +132,6 @@ static inline void complete_release_commit(struct completion *x) {}
 #endif
 
 /**
- * init_completion - Initialize a dynamically allocated completion
- * @x:  pointer to completion structure that is to be initialized
- *
- * This inline function will initialize a dynamically created completion
- * structure.
- */
-static inline void __init_completion(struct completion *x)
-{
-	x->done = 0;
-	init_waitqueue_head(&x->wait);
-}
-
-/**
  * reinit_completion - reinitialize a completion structure
  * @x:  pointer to completion structure that is to be reinitialized
  *
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 1c49431f3121..c6d801d47535 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -373,7 +373,9 @@ extern struct workqueue_struct *system_freezable_power_efficient_wq;
 
 extern struct workqueue_struct *
 __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
-	struct lock_class_key *key, const char *lock_name, ...) __printf(1, 6);
+		      struct lock_class_key *key, const char *lock_name,
+		      struct lock_class_key *c_key, const char *c_lock_name,
+		      ...) __printf(1, 8);
 
 /**
  * alloc_workqueue - allocate a workqueue
@@ -392,7 +394,21 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
  * RETURNS:
  * Pointer to the allocated workqueue on success, %NULL on failure.
  */
-#ifdef CONFIG_LOCKDEP
+#if defined(CONFIG_LOCKDEP_COMPLETIONS)
+#define alloc_workqueue(fmt, flags, max_active, args...)		\
+({									\
+	static struct lock_class_key __key, __c_key;			\
+	const char *__lock_name, *__c_lock_name;			\
+									\
+	__lock_name = #fmt#args;					\
+	__c_lock_name = #fmt#args"c";					\
+									\
+	__alloc_workqueue_key((fmt), (flags), (max_active),		\
+			      &__key, __lock_name,			\
+			      &__c_key, __c_lock_name,			\
+			      ##args);					\
+})
+#elif defined(CONFIG_LOCKDEP)
 #define alloc_workqueue(fmt, flags, max_active, args...)		\
 ({									\
 	static struct lock_class_key __key;				\
@@ -401,12 +417,14 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
 	__lock_name = #fmt#args;					\
 									\
 	__alloc_workqueue_key((fmt), (flags), (max_active),		\
-			      &__key, __lock_name, ##args);		\
+			      &__key, __lock_name,			\
+			      NULL, NULL,				\
+			      ##args);					\
 })
 #else
 #define alloc_workqueue(fmt, flags, max_active, args...)		\
 	__alloc_workqueue_key((fmt), (flags), (max_active),		\
-			      NULL, NULL, ##args)
+			      NULL, NULL, NULL, NULL, ##args)
 #endif
 
 /**
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 64d0edf428f8..c267fa7ad1ec 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -263,6 +263,9 @@ struct workqueue_struct {
 #ifdef CONFIG_LOCKDEP
 	struct lockdep_map	lockdep_map;
 #endif
+#ifdef CONFIG_LOCKDEP_COMPLETIONS
+	struct lockdep_map_cross completion_lockdep_map;
+#endif
 	char			name[WQ_NAME_LEN]; /* I: workqueue name */
 
 	/*
@@ -2611,13 +2614,14 @@ void flush_workqueue(struct workqueue_struct *wq)
 	struct wq_flusher this_flusher = {
 		.list = LIST_HEAD_INIT(this_flusher.list),
 		.flush_color = -1,
-		.done = COMPLETION_INITIALIZER_ONSTACK(this_flusher.done),
 	};
 	int next_color;
 
 	if (WARN_ON(!wq_online))
 		return;
 
+	init_completion_map(&this_flusher.done, &wq->completion_lockdep_map);
+
 	lock_map_acquire(&wq->lockdep_map);
 	lock_map_release(&wq->lockdep_map);
 
@@ -3962,11 +3966,11 @@ static int wq_clamp_max_active(int max_active, unsigned int flags,
 	return clamp_val(max_active, 1, lim);
 }
 
-struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
-					       unsigned int flags,
-					       int max_active,
-					       struct lock_class_key *key,
-					       const char *lock_name, ...)
+struct workqueue_struct *
+__alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
+		      struct lock_class_key *key, const char *lock_name,
+		      struct lock_class_key *c_key, const char *c_lock_name,
+		      ...)
 {
 	size_t tbl_size = 0;
 	va_list args;
@@ -4001,7 +4005,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
 			goto err_free_wq;
 	}
 
-	va_start(args, lock_name);
+	va_start(args, c_lock_name);
 	vsnprintf(wq->name, sizeof(wq->name), fmt, args);
 	va_end(args);
 
@@ -4019,6 +4023,8 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
 	INIT_LIST_HEAD(&wq->maydays);
 
 	lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
+	lockdep_init_map_crosslock((struct lockdep_map *)&wq->completion_lockdep_map,
+				   c_lock_name, c_key, 0);
 	INIT_LIST_HEAD(&wq->list);
 
 	if (alloc_and_link_pwqs(wq) < 0)
-- 
2.9.5
