lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CANrsvRNswL003jsiO-GEWJfrPUS7KKc=J1OQdmLvJiVFOdOMiw@mail.gmail.com>
Date:   Sat, 14 Oct 2017 08:42:09 +0900
From:   Byungchul Park <max.byungchul.park@...il.com>
To:     Tvrtko Ursulin <tursulin@...ulin.net>
Cc:     Intel-gfx@...ts.freedesktop.org,
        Tvrtko Ursulin <tvrtko.ursulin@...el.com>,
        Daniel Vetter <daniel.vetter@...ll.ch>,
        Chris Wilson <chris@...is-wilson.co.uk>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Tejun Heo <tj@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Boqun Feng <boqun.feng@...il.com>,
        "byungchul.park@....com" <byungchul.park@....com>,
        david@...morbit.com, Johannes Berg <johannes@...solutions.net>,
        oleg@...hat.com, kernel-team@....com
Subject: Re: [RFC v2] locking/lockdep: Don't bunch up all flush_workqueue
 callers into a single completion cross-release context

On Sat, Oct 14, 2017 at 12:42 AM, Tvrtko Ursulin <tursulin@...ulin.net> wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@...el.com>
>
> It looks like all completions as created by flush_workqueue map
> into a single lock class which creates lockdep false positives.

Hello,

It's already in progress. Please take a look at the following.

   https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1487015.html

Peter said he has no time to look at it atm. I believe that's why the
review has been
dragging on. We need to wait for him to be available.

Thanks,
Byungchul

> Example of a false positive:
>
>  [   20.805315] ======================================================
>  [   20.805316] WARNING: possible circular locking dependency detected
>  [   20.805319] 4.14.0-rc4-CI-CI_DRM_3225+ #1 Tainted: G     U
>  [   20.805320] ------------------------------------------------------
>  [   20.805322] kworker/6:1H/1438 is trying to acquire lock:
>  [   20.805324]  (&mm->mmap_sem){++++}, at: [<ffffffffa01c8e01>] __i915_gem_userptr_get_pages_worker+0x141/0x240 [i915]
>  [   20.805355] but now in release context of a crosslock acquired at the following:
>  [   20.805357]  ((complete)&this_flusher.done){+.+.}, at: [<ffffffff8190b06d>] wait_for_completion+0x1d/0x20
>  [   20.805363] which lock already depends on the new lock.
>  [   20.805365]  the existing dependency chain (in reverse order) is:
>  [   20.805367] -> #1 ((complete)&this_flusher.done){+.+.}:
>  [   20.805372]        __lock_acquire+0x1420/0x15e0
>  [   20.805374]        lock_acquire+0xb0/0x200
>  [   20.805376]        wait_for_common+0x58/0x210
>  [   20.805378]        wait_for_completion+0x1d/0x20
>  [   20.805381]        flush_workqueue+0x1af/0x540
>  [   20.805400]        i915_gem_userptr_mn_invalidate_range_start+0x13c/0x150 [i915]
>  [   20.805404]        __mmu_notifier_invalidate_range_start+0x76/0xc0
>  [   20.805406]        unmap_vmas+0x7d/0xa0
>  [   20.805408]        unmap_region+0xae/0x110
>  [   20.805410]        do_munmap+0x276/0x3f0
>  [   20.805411]        vm_munmap+0x67/0x90
>  [   20.805413]        SyS_munmap+0xe/0x20
>  [   20.805415]        entry_SYSCALL_64_fastpath+0x1c/0xb1
>  [   20.805416] -> #0 (&mm->mmap_sem){++++}:
>  [   20.805419]        down_read+0x3e/0x70
>  [   20.805435]        __i915_gem_userptr_get_pages_worker+0x141/0x240 [i915]
>  [   20.805438]        process_one_work+0x233/0x660
>  [   20.805440]        worker_thread+0x4e/0x3b0
>  [   20.805441]        kthread+0x152/0x190
>  [   20.805442] other info that might help us debug this:
>  [   20.805445]  Possible unsafe locking scenario by crosslock:
>  [   20.805447]        CPU0                    CPU1
>  [   20.805448]        ----                    ----
>  [   20.805449]   lock(&mm->mmap_sem);
>  [   20.805451]   lock((complete)&this_flusher.done);
>  [   20.805453]                                lock(&mm->mmap_sem);
>  [   20.805455]                                unlock((complete)&this_flusher.done);
>  [   20.805457] *** DEADLOCK ***
>  [   20.805460] 2 locks held by kworker/6:1H/1438:
>  [   20.805461]  #0:  (&(&pool->lock)->rlock){-.-.}, at: [<ffffffff8109c94c>] process_one_work+0x2dc/0x660
>  [   20.805465]  #1:  (&x->wait#10){....}, at: [<ffffffff810cd69d>] complete+0x1d/0x60
>  [   20.805469] stack backtrace:
>  [   20.805472] CPU: 6 PID: 1438 Comm: kworker/6:1H Tainted: G     U          4.14.0-rc4-CI-CI_DRM_3225+ #1
>  [   20.805474] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
>  [   20.805480] Call Trace:
>  [   20.805483]  dump_stack+0x68/0x9f
>  [   20.805486]  print_circular_bug+0x235/0x3c0
>  [   20.805488]  ? HARDIRQ_verbose+0x10/0x10
>  [   20.805490]  check_prev_add+0x430/0x840
>  [   20.805492]  ? ret_from_fork+0x27/0x40
>  [   20.805494]  lock_commit_crosslock+0x3ee/0x660
>  [   20.805496]  ? lock_commit_crosslock+0x3ee/0x660
>  [   20.805498]  complete+0x29/0x60
>  [   20.805500]  pwq_dec_nr_in_flight+0x9c/0xa0
>  [   20.805502]  ? _raw_spin_lock_irq+0x40/0x50
>  [   20.805504]  process_one_work+0x335/0x660
>  [   20.805506]  worker_thread+0x4e/0x3b0
>  [   20.805508]  kthread+0x152/0x190
>  [   20.805509]  ? process_one_work+0x660/0x660
>  [   20.805511]  ? kthread_create_on_node+0x40/0x40
>  [   20.805513]  ret_from_fork+0x27/0x40
>
> Solution here is is to create a lockdep map for completions
> together with the workqueue object, and allow the completions to
> be initialized with an external (pointer) lockdep map. This map
> is then used over the built-in lockdep_map structure.
>
> This way the completion cross-release context is tied with it's
> parent workqueue and should still catch all issues without being
> susceptible to false positives.
>
> v2: Simplify by always having a valid map pointer. (Chris Wilson)
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@...el.com>
> Cc: Daniel Vetter <daniel.vetter@...ll.ch>
> Cc: Chris Wilson <chris@...is-wilson.co.uk>
> Cc: linux-kernel@...r.kernel.org
> Cc: Tejun Heo <tj@...nel.org>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: boqun.feng@...il.com
> Cc: byungchul.park@....com
> Cc: david@...morbit.com
> Cc: johannes@...solutions.net
> Cc: oleg@...hat.com
> --
> Borrowed the Cc list from a recent fix in the same area.
> Apologies if it is too wide.
> ---
>  include/linux/completion.h | 46 +++++++++++++++++++++++++++++-----------------
>  include/linux/workqueue.h  | 26 ++++++++++++++++++++++----
>  kernel/workqueue.c         | 20 +++++++++++++-------
>  3 files changed, 64 insertions(+), 28 deletions(-)
>
> diff --git a/include/linux/completion.h b/include/linux/completion.h
> index cae5400022a3..de4ee32ede0f 100644
> --- a/include/linux/completion.h
> +++ b/include/linux/completion.h
> @@ -29,24 +29,41 @@ struct completion {
>         unsigned int done;
>         wait_queue_head_t wait;
>  #ifdef CONFIG_LOCKDEP_COMPLETIONS
> +       struct lockdep_map_cross *pmap;
>         struct lockdep_map_cross map;
>  #endif
>  };
>
> +/**
> + * init_completion - Initialize a dynamically allocated completion
> + * @x:  pointer to completion structure that is to be initialized
> + *
> + * This inline function will initialize a dynamically created completion
> + * structure.
> + */
> +static inline void __init_completion(struct completion *x)
> +{
> +       x->done = 0;
> +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> +       x->pmap = &x->map;
> +#endif
> +       init_waitqueue_head(&x->wait);
> +}
> +
>  #ifdef CONFIG_LOCKDEP_COMPLETIONS
>  static inline void complete_acquire(struct completion *x)
>  {
> -       lock_acquire_exclusive((struct lockdep_map *)&x->map, 0, 0, NULL, _RET_IP_);
> +       lock_acquire_exclusive((struct lockdep_map *)x->pmap, 0, 0, NULL, _RET_IP_);
>  }
>
>  static inline void complete_release(struct completion *x)
>  {
> -       lock_release((struct lockdep_map *)&x->map, 0, _RET_IP_);
> +       lock_release((struct lockdep_map *)x->pmap, 0, _RET_IP_);
>  }
>
>  static inline void complete_release_commit(struct completion *x)
>  {
> -       lock_commit_crosslock((struct lockdep_map *)&x->map);
> +       lock_commit_crosslock((struct lockdep_map *)x->pmap);
>  }
>
>  #define init_completion(x)                                             \
> @@ -57,8 +74,16 @@ do {                                                                 \
>                         &__key, 0);                                     \
>         __init_completion(x);                                           \
>  } while (0)
> +
> +static inline void init_completion_map(struct completion *x,
> +                                      struct lockdep_map_cross *map)
> +{
> +       __init_completion(x);
> +       x->pmap = map;
> +}
>  #else
>  #define init_completion(x) __init_completion(x)
> +#define init_completion_map(x, m) __init_completion(x)
>  static inline void complete_acquire(struct completion *x) {}
>  static inline void complete_release(struct completion *x) {}
>  static inline void complete_release_commit(struct completion *x) {}
> @@ -66,7 +91,7 @@ static inline void complete_release_commit(struct completion *x) {}
>
>  #ifdef CONFIG_LOCKDEP_COMPLETIONS
>  #define COMPLETION_INITIALIZER(work) \
> -       { 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
> +       { 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait), &(work).map, \
>         STATIC_CROSS_LOCKDEP_MAP_INIT("(complete)" #work, &(work)) }
>  #else
>  #define COMPLETION_INITIALIZER(work) \
> @@ -107,19 +132,6 @@ static inline void complete_release_commit(struct completion *x) {}
>  #endif
>
>  /**
> - * init_completion - Initialize a dynamically allocated completion
> - * @x:  pointer to completion structure that is to be initialized
> - *
> - * This inline function will initialize a dynamically created completion
> - * structure.
> - */
> -static inline void __init_completion(struct completion *x)
> -{
> -       x->done = 0;
> -       init_waitqueue_head(&x->wait);
> -}
> -
> -/**
>   * reinit_completion - reinitialize a completion structure
>   * @x:  pointer to completion structure that is to be reinitialized
>   *
> diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
> index 1c49431f3121..c6d801d47535 100644
> --- a/include/linux/workqueue.h
> +++ b/include/linux/workqueue.h
> @@ -373,7 +373,9 @@ extern struct workqueue_struct *system_freezable_power_efficient_wq;
>
>  extern struct workqueue_struct *
>  __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
> -       struct lock_class_key *key, const char *lock_name, ...) __printf(1, 6);
> +                     struct lock_class_key *key, const char *lock_name,
> +                     struct lock_class_key *c_key, const char *c_lock_name,
> +                     ...) __printf(1, 8);
>
>  /**
>   * alloc_workqueue - allocate a workqueue
> @@ -392,7 +394,21 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
>   * RETURNS:
>   * Pointer to the allocated workqueue on success, %NULL on failure.
>   */
> -#ifdef CONFIG_LOCKDEP
> +#if defined(CONFIG_LOCKDEP_COMPLETIONS)
> +#define alloc_workqueue(fmt, flags, max_active, args...)               \
> +({                                                                     \
> +       static struct lock_class_key __key, __c_key;                    \
> +       const char *__lock_name, *__c_lock_name;                        \
> +                                                                       \
> +       __lock_name = #fmt#args;                                        \
> +       __c_lock_name = #fmt#args"c";                                   \
> +                                                                       \
> +       __alloc_workqueue_key((fmt), (flags), (max_active),             \
> +                             &__key, __lock_name,                      \
> +                             &__c_key, __c_lock_name,                  \
> +                             ##args);                                  \
> +})
> +#elif defined(CONFIG_LOCKDEP)
>  #define alloc_workqueue(fmt, flags, max_active, args...)               \
>  ({                                                                     \
>         static struct lock_class_key __key;                             \
> @@ -401,12 +417,14 @@ __alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
>         __lock_name = #fmt#args;                                        \
>                                                                         \
>         __alloc_workqueue_key((fmt), (flags), (max_active),             \
> -                             &__key, __lock_name, ##args);             \
> +                             &__key, __lock_name,                      \
> +                             NULL, NULL,                               \
> +                             ##args);                                  \
>  })
>  #else
>  #define alloc_workqueue(fmt, flags, max_active, args...)               \
>         __alloc_workqueue_key((fmt), (flags), (max_active),             \
> -                             NULL, NULL, ##args)
> +                             NULL, NULL, NULL, NULL, ##args)
>  #endif
>
>  /**
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 64d0edf428f8..c267fa7ad1ec 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -263,6 +263,9 @@ struct workqueue_struct {
>  #ifdef CONFIG_LOCKDEP
>         struct lockdep_map      lockdep_map;
>  #endif
> +#ifdef CONFIG_LOCKDEP_COMPLETIONS
> +       struct lockdep_map_cross completion_lockdep_map;
> +#endif
>         char                    name[WQ_NAME_LEN]; /* I: workqueue name */
>
>         /*
> @@ -2611,13 +2614,14 @@ void flush_workqueue(struct workqueue_struct *wq)
>         struct wq_flusher this_flusher = {
>                 .list = LIST_HEAD_INIT(this_flusher.list),
>                 .flush_color = -1,
> -               .done = COMPLETION_INITIALIZER_ONSTACK(this_flusher.done),
>         };
>         int next_color;
>
>         if (WARN_ON(!wq_online))
>                 return;
>
> +       init_completion_map(&this_flusher.done, &wq->completion_lockdep_map);
> +
>         lock_map_acquire(&wq->lockdep_map);
>         lock_map_release(&wq->lockdep_map);
>
> @@ -3962,11 +3966,11 @@ static int wq_clamp_max_active(int max_active, unsigned int flags,
>         return clamp_val(max_active, 1, lim);
>  }
>
> -struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
> -                                              unsigned int flags,
> -                                              int max_active,
> -                                              struct lock_class_key *key,
> -                                              const char *lock_name, ...)
> +struct workqueue_struct *
> +__alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
> +                     struct lock_class_key *key, const char *lock_name,
> +                     struct lock_class_key *c_key, const char *c_lock_name,
> +                     ...)
>  {
>         size_t tbl_size = 0;
>         va_list args;
> @@ -4001,7 +4005,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
>                         goto err_free_wq;
>         }
>
> -       va_start(args, lock_name);
> +       va_start(args, c_lock_name);
>         vsnprintf(wq->name, sizeof(wq->name), fmt, args);
>         va_end(args);
>
> @@ -4019,6 +4023,8 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
>         INIT_LIST_HEAD(&wq->maydays);
>
>         lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
> +       lockdep_init_map_crosslock((struct lockdep_map *)&wq->completion_lockdep_map,
> +                                  c_lock_name, c_key, 0);
>         INIT_LIST_HEAD(&wq->list);
>
>         if (alloc_and_link_pwqs(wq) < 0)
> --
> 2.9.5
>



-- 
Thanks,
Byungchul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ