lists.openwall.net - Open Source and information security mailing list archives
Date: Mon, 18 Feb 2013 09:37:20 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Ben Skeggs <bskeggs@...hat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@...onical.com>,
	linux-kernel@...r.kernel.org
Subject: WARNING: at kernel/mutex.c:386 __mutex_lock_common()

Greetings,

I got the below warning, and the first bad commit is:

commit 11eb5a2ce9655ee552f705f2da6b8f96f466c0d4
Author: Maarten Lankhorst <maarten.lankhorst@...onical.com>
Date:   Wed Nov 7 11:53:51 2012 +0100

    mutex: add support for reservation style locks

GPUs do operations that commonly involve many buffers.  Those buffers can
be shared across contexts/processes, exist in different memory domains
(for example VRAM vs system memory), and so on.  And with PRIME / dmabuf,
they can even be shared across devices.  So there are a handful of
situations where the driver needs to wait for buffers to become ready.

If you think about this in terms of waiting on a buffer mutex for it to
become available, this presents a problem because there is no way to
guarantee that buffers appear in an execbuf/batch in the same order in
all contexts.  That is directly under control of userspace, and a result
of the sequence of GL calls that an application makes.  Which results in
the potential for deadlock.  The problem gets more complex when you
consider that the kernel may need to migrate the buffer(s) into VRAM
before the GPU operates on the buffer(s), which may in turn require
evicting some other buffers (and you don't want to evict other buffers
which are already queued up to the GPU), but for a simplified
understanding of the problem you can ignore this.

The algorithm that TTM came up with for dealing with this problem is
quite simple.  For each group of buffers (execbuf) that need to be
locked, the caller would be assigned a unique reservation_id, from a
global counter.
In case of deadlock in the process of locking all the buffers associated
with an execbuf, the one with the lowest reservation_id wins, and the one
with the higher reservation_id unlocks all of the buffers that it has
already locked, and then tries again.

How it is used:
---------------

A very simplified version:

    int lock_execbuf(execbuf)
    {
        struct buf *res_buf = NULL;

        /* acquiring locks, before queuing up to GPU: */
        seqno = assign_global_seqno();

    retry:
        for (buf in execbuf->buffers) {
            if (buf == res_buf) {
                res_buf = NULL;
                continue;
            }
            ret = mutex_reserve_lock(&buf->lock, seqno);
            if (ret < 0)
                goto err;
        }

        /* now everything is good to go, submit job to GPU: */
        ...

        return 0;

    err:
        for (all buf2 before buf in execbuf->buffers)
            mutex_unreserve_unlock(&buf2->lock);
        if (res_buf)
            mutex_unreserve_unlock(&res_buf->lock);

        if (ret == -EAGAIN) {
            /* we lost out in a seqno race, lock and retry.. */
            mutex_reserve_lock_slow(&buf->lock, seqno);
            res_buf = buf;
            goto retry;
        }

        return ret;
    }

    int unlock_execbuf(execbuf)
    {
        /* when GPU is finished: */
        for (buf in execbuf->buffers)
            mutex_unreserve_unlock(&buf->lock);
    }

Functions:
----------

mutex_reserve_lock, and mutex_reserve_lock_interruptible:
    Lock a buffer with a reservation_id set.  reservation_id must not be
    set to 0, since this is a special value that means no reservation_id.

    Normally if reservation_id is not set, or is older than the
    reservation_id that's currently set on the mutex, the behavior will
    be to wait normally.  However, if the reservation_id is newer than
    the current reservation_id, -EAGAIN will be returned.

    These functions will return -EDEADLK instead of -EAGAIN if the
    reservation_id is the same as the one the mutex is already locked
    with, since in that case you presumably attempted to lock the same
    lock twice.

mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow:
    Similar to mutex_reserve_lock, except it won't backoff with -EAGAIN.
    This is useful when mutex_reserve_lock failed with -EAGAIN, and you
    unreserved all buffers so no deadlock can occur.

mutex_unreserve_unlock:
    Unlock a buffer reserved with one of the mutex_reserve_*lock* calls.

Missing at the moment, maybe TODO?
  * Check if lockdep warns if you unlock a lock that other locks were
    nested to.

      spin_lock(m);
      spin_lock_nest_lock(a, m);
      spin_unlock(m);
      spin_unlock(a);

    It would be nice if this gave a splat on spin_unlock(m); I'm not
    100% sure it does right now, though..

  * In the *_slow calls, maybe add a check to ensure no other locks of
    the same lockdep class are nested in the lock we have to nest to?
    This is to make sure that mutex_unreserve_unlock has been called on
    all the other locks.

Design:

I chose to have ticket_mutex encapsulate struct mutex, so the extra
memory usage and the atomic set on init only happen when you
deliberately create a ticket lock.  Since the mutexes are mostly meant
to protect buffer object serialization in TTM, not much contention is
expected.

I could be slightly smarter with wakeups, but that would be at the
expense of adding a field to struct mutex_waiter.  That would add
overhead to all cases where normal mutexes are used, and ticket mutexes
are less performance sensitive anyway, since they only protect buffer
objects.  As a result I chose not to do this.

I needed this in kernel/mutex.c because of the extensions to
__lock_common, which are hopefully optimized away for all normal paths.

It is not illegal to use mutex_lock and mutex_unlock on ticket mutexes.
This will work, as long as you don't mix lock/unlock calls.  This is
useful if you want to lock only a single buffer.

All the mutex_reserve calls are nested inside another lock.  The idea is
that the seqno ticket you use functions as a lockdep class you have
locked too.  This prevents lockdep false positives when locking 2
objects of the same class, and it's allowed because they're nested to
the seqno ticket.
There are extensive tests for this in the patch that introduces locking
tests for reservation objects and reservation tickets.

Changes since RFC patch v1:
 - Updated to use atomic_long instead of atomic, since the
   reservation_id was a long.
 - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
 - removed mutex_locked_set_reservation_id (or w/e it was called)

Changes since RFC patch v2:
 - remove use of __mutex_lock_retval_arg, add warnings when using the
   wrong combination of mutex_(,reserve_)lock/unlock.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@...onical.com>

[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: at /c/kernel-tests/src/tip/kernel/mutex.c:386 __mutex_lock_common+0x5a9/0x870()
[    0.000000] Hardware name: Bochs
[    0.000000] Modules linked in:
[    0.000000] Pid: 0, comm: swapper/0 Not tainted 3.8.0-rc7-00071-g11eb5a2 #180
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff81047102>] warn_slowpath_common+0xb2/0x120
[    0.000000]  [<ffffffff81047195>] warn_slowpath_null+0x25/0x30
[    0.000000]  [<ffffffff814ebdd9>] __mutex_lock_common+0x5a9/0x870
[    0.000000]  [<ffffffff81a66ef8>] ? rcu_cpu_notify+0xa8/0x451
[    0.000000]  [<ffffffff810bcb8f>] ? trace_hardirqs_off_caller+0xaf/0x120
[    0.000000]  [<ffffffff81a66ef8>] ? rcu_cpu_notify+0xa8/0x451
[    0.000000]  [<ffffffff810be38c>] ? lockdep_init_map+0xfc/0x230
[    0.000000]  [<ffffffff814ec621>] mutex_lock_nested+0x61/0x80
[    0.000000]  [<ffffffff810bcc1d>] ? trace_hardirqs_off+0x1d/0x30
[    0.000000]  [<ffffffff81a66ef8>] rcu_cpu_notify+0xa8/0x451
[    0.000000]  [<ffffffff814ecc31>] ? mutex_unlock+0x11/0x20
[    0.000000]  [<ffffffff81a3f952>] rcu_init+0x3b3/0x408
[    0.000000]  [<ffffffff81a21e0c>] start_kernel+0x34a/0x744
[    0.000000]  [<ffffffff81a216e6>] ? repair_env_string+0x81/0x81
[    0.000000]  [<ffffffff81a21120>] ? early_idt_handlers+0x120/0x120
[    0.000000]  [<ffffffff81a212fd>] x86_64_start_reservations+0x185/0x190
[    0.000000]  [<ffffffff81a214a8>] x86_64_start_kernel+0x1a0/0x1b6
[    0.000000] ---[ end trace 8e966724b1809892 ]---

git bisect start 11eb5a2ce9655ee552f705f2da6b8f96f466c0d4 836dc9e3fbbab0c30aa6e664417225f5c1fb1c39 --
git bisect good 569104e21addbec215daa2d1d2c7b2a274ebcc27  # 1000  2013-02-15 00:30:51  drm/nouveau: share fence structures between nv10+ and nv50 implementations
git bisect good dacf4c29d7e3b2e0903fdef6da8296fe51132dca  # 1000  2013-02-15 02:49:28  drm/nouveau/fifo/nvc0: improve interrupt handler somewhat
git bisect good 51fb70f904cf004272bd0a729959e4da76cb50dc  # 1000  2013-02-15 05:08:37  drm/nouveau/gpio: pass number of on-die gpio lines to base
git bisect good 38bc1653a04eeea372a4249ef58d554ce38bd0f5  # 1000  2013-02-15 07:27:46  drm/nv84-/fence: abstract class emit/sync functions to virt+sequence
git bisect good ef80d98bdcaebca98ec9790ee584f903557ca149  # 1000  2013-02-15 09:47:41  drm/nouveau/fence: make internal hooks part of the context
git bisect good 5fdd378c20bb56824d57905591f1ec93a02513b9  # 1000  2013-02-15 12:06:18  arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.
git bisect good 5fdd378c20bb56824d57905591f1ec93a02513b9  # 3001  2013-02-15 18:58:19  arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.
git bisect good 075198d07b7b64b50afb54853dc390ded9962ee4  # 3000  2013-02-16 02:12:51  Revert "mutex: add support for reservation style locks"
git bisect good be03210f1bbdde34b397e86c115511667210fe98  # 3000  2013-02-18 05:35:58  Add linux-next specific files for 20130215

Thanks,
Fengguang

View attachment "dmesg-kvm-xbm-3830-2013-02-14-19-00-45-3.8.0-rc7-00071-g11eb5a2-180" of type "text/plain" (29906 bytes)
View attachment "11eb5a2-bisect.log" of type "text/plain" (30335 bytes)
View attachment ".config-bisect" of type "text/plain" (81978 bytes)