Message-ID: <20130218013720.GC10963@localhost>
Date: Mon, 18 Feb 2013 09:37:20 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Ben Skeggs <bskeggs@...hat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@...onical.com>,
linux-kernel@...r.kernel.org
Subject: WARNING: at kernel/mutex.c:386 __mutex_lock_common()
Greetings,
I got the below warning and the first bad commit is
commit 11eb5a2ce9655ee552f705f2da6b8f96f466c0d4
Author: Maarten Lankhorst <maarten.lankhorst@...onical.com>
Date: Wed Nov 7 11:53:51 2012 +0100
mutex: add support for reservation style locks
GPUs do operations that commonly involve many buffers. Those buffers
can be shared across contexts/processes, exist in different memory
domains (for example VRAM vs. system memory), and so on. And with
PRIME / dmabuf, they can even be shared across devices. So there are
a handful of situations where the driver needs to wait for buffers to
become ready. If you think about this in terms of waiting on a buffer
mutex for it to become available, this presents a problem because
there is no way to guarantee that buffers appear in an execbuf/batch in
the same order in all contexts. That is directly under the control of
userspace, and a result of the sequence of GL calls that an application
makes, which creates the potential for deadlock. The problem gets
more complex when you consider that the kernel may need to migrate the
buffer(s) into VRAM before the GPU operates on the buffer(s), which
may in turn require evicting some other buffers (and you don't want to
evict other buffers which are already queued up to the GPU), but for a
simplified understanding of the problem you can ignore this.
The algorithm that TTM came up with for dealing with this problem is
quite simple. For each group of buffers (execbuf) that need to be
locked, the caller is assigned a unique reservation_id from a
global counter. In case of deadlock while locking all the
buffers associated with an execbuf, the one with the lowest
reservation_id wins, and the one with the higher reservation_id
unlocks all of the buffers that it has already locked, and then tries
again.
How it is used:
---------------
A very simplified version:
int lock_execbuf(execbuf)
{
	struct buf *res_buf = NULL;

	/* acquiring locks, before queuing up to GPU: */
	seqno = assign_global_seqno();

retry:
	for (buf in execbuf->buffers) {
		if (buf == res_buf) {
			res_buf = NULL;
			continue;
		}
		ret = mutex_reserve_lock(&buf->lock, seqno);
		if (ret < 0)
			goto err;
	}

	/* now everything is good to go, submit job to GPU: */
	...

	return 0;

err:
	for (all buf2 before buf in execbuf->buffers)
		mutex_unreserve_unlock(&buf2->lock);
	if (res_buf)
		mutex_unreserve_unlock(&res_buf->lock);

	if (ret == -EAGAIN) {
		/* we lost out in a seqno race, lock and retry.. */
		mutex_reserve_lock_slow(&buf->lock, seqno);
		res_buf = buf;
		goto retry;
	}

	return ret;
}

int unlock_execbuf(execbuf)
{
	/* when GPU is finished: */
	for (buf in execbuf->buffers)
		mutex_unreserve_unlock(&buf->lock);
}
Functions:
----------
mutex_reserve_lock, and mutex_reserve_lock_interruptible:
  Lock a buffer with a reservation_id set. reservation_id must not be
  set to 0, since this is a special value that means no reservation_id.

  Normally, if the mutex has no reservation_id set, or if the caller's
  reservation_id is older than the reservation_id currently set on the
  mutex, the behavior is to wait normally. However, if the caller's
  reservation_id is newer than the current reservation_id, -EAGAIN will
  be returned.

  These functions will return -EDEADLK instead of -EAGAIN if the
  reservation_id is the same as the reservation_id the mutex was already
  locked with, since in that case you presumably attempted to lock the
  same lock twice.

mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow:
  Similar to mutex_reserve_lock, except they won't back off with -EAGAIN.
  This is useful after mutex_reserve_lock has failed with -EAGAIN and you
  have unreserved all buffers, so no deadlock can occur.

mutex_unreserve_unlock:
  Unlock a buffer reserved with one of the mutex_reserve_*lock* calls.
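
To make the return-value contract concrete, here is a minimal sketch of
taking two buffer locks under one reservation_id (my illustration, not
code from the patch; the trylock_pair name and exact parameter types are
assumptions):

	/*
	 * One attempt at locking two buffers with the same seqno.  On
	 * -EAGAIN the caller is expected to back off and retry, as in
	 * the lock_execbuf() example above.
	 */
	static int trylock_pair(struct ticket_mutex *a,
				struct ticket_mutex *b, unsigned long seqno)
	{
		int ret;

		ret = mutex_reserve_lock(a, seqno);
		if (ret)
			return ret;	/* -EAGAIN: our seqno is newer, back off */

		ret = mutex_reserve_lock(b, seqno);
		if (ret) {
			/*
			 * -EAGAIN: our seqno lost against b's holder;
			 * -EDEADLK: we already hold 'b' with this seqno.
			 * Either way, drop 'a' before reporting it.
			 */
			mutex_unreserve_unlock(a);
			return ret;
		}
		return 0;
	}

On -EAGAIN the caller would take the contended lock with
mutex_reserve_lock_slow() and retry the rest, exactly as lock_execbuf()
does above.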
Missing at the moment, maybe TODO?
* Check if lockdep warns if you unlock a lock that other locks were nested
  to.

  - spin_lock(m);
    spin_lock_nest_lock(a, m);
    spin_unlock(m);
    spin_unlock(a);

    It would be nice if this gave a splat on spin_unlock(m);
    I'm not 100% sure it does right now, though..

* In the *_slow calls, maybe add a check to ensure no other locks of the
  same lockdep class are nested in the lock we have to nest to?

  - This is to make sure that mutex_unreserve_unlock has been called on all
    other locks.
Design:
-------
I chose to have ticket_mutex encapsulate struct mutex, so the extra memory
usage and the atomic set on init only happen when you deliberately create a
ticket lock. Since the mutexes are mostly meant to protect buffer object
serialization in TTM, not much contention is expected. I could be slightly
smarter with wakeups, but that would come at the expense of adding a field
to struct mutex_waiter. This would add overhead to all cases where normal
mutexes are used, and ticket mutexes are less performance-sensitive anyway,
since they only protect buffer objects. As a result I chose not to do this.
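
As a rough picture of that encapsulation (my guess at the shape, not the
patch's actual definition; the field names are assumptions):

	struct ticket_mutex {
		struct mutex	base;		/* plain mutex underneath */
		atomic_long_t	reservation_id;	/* 0 means "not reserved" */
	};

The atomic_long_t matches the changelog note below that reservation_id is
a long, and 0 is the "no reservation_id" special value described above.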
I needed this in kernel/mutex.c because of the extensions to
__mutex_lock_common, which are hopefully optimized away for all normal
paths.
It is not illegal to use mutex_lock and mutex_unlock on ticket mutexes;
this will work as long as you don't mix lock/unlock calls. It is useful if
you want to lock only a single buffer.
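
For instance (a sketch; whether the plain calls take the embedded mutex
directly is my assumption, hence the .base below):

	/* single buffer, so no ordering hazard: plain calls suffice,
	 * provided the unlock matches the lock that was taken */
	mutex_lock(&buf->lock.base);
	/* ... CPU access to the buffer ... */
	mutex_unlock(&buf->lock.base);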
All the mutex_reserve calls are nested into another lock. The idea is that
the seqno ticket you use functions as a lockdep class that you have locked
as well. This prevents lockdep false positives when locking two objects of
the same class: it's allowed because they're nested to the seqno ticket.
There are extensive tests for this in the patch that introduces locking
tests for reservation objects and reservation tickets.
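
The existing nest_lock annotation illustrates the idea, with the seqno
ticket playing the role of the outer lock (a sketch using the stock
lockdep API, not whatever the patch's tests actually do):

	mutex_lock(&outer);			    /* stands in for the seqno ticket */
	mutex_lock_nest_lock(&obj1->lock, &outer);
	mutex_lock_nest_lock(&obj2->lock, &outer);  /* same class, no splat */
	...
	mutex_unlock(&obj2->lock);
	mutex_unlock(&obj1->lock);
	mutex_unlock(&outer);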
Changes since RFC patch v1:
- Updated to use atomic_long instead of atomic, since the reservation_id
  was a long.
- Added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow.
- Removed mutex_locked_set_reservation_id (or whatever it was called).

Changes since RFC patch v2:
- Removed use of __mutex_lock_retval_arg; added warnings when using the
  wrong combination of mutex_(,reserve_)lock/unlock.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@...onical.com>
[ 0.000000]
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: at /c/kernel-tests/src/tip/kernel/mutex.c:386 __mutex_lock_common+0x5a9/0x870()
[ 0.000000] Hardware name: Bochs
[ 0.000000] Modules linked in:
[ 0.000000] Pid: 0, comm: swapper/0 Not tainted 3.8.0-rc7-00071-g11eb5a2 #180
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff81047102>] warn_slowpath_common+0xb2/0x120
[ 0.000000] [<ffffffff81047195>] warn_slowpath_null+0x25/0x30
[ 0.000000] [<ffffffff814ebdd9>] __mutex_lock_common+0x5a9/0x870
[ 0.000000] [<ffffffff81a66ef8>] ? rcu_cpu_notify+0xa8/0x451
[ 0.000000] [<ffffffff810bcb8f>] ? trace_hardirqs_off_caller+0xaf/0x120
[ 0.000000] [<ffffffff81a66ef8>] ? rcu_cpu_notify+0xa8/0x451
[ 0.000000] [<ffffffff810be38c>] ? lockdep_init_map+0xfc/0x230
[ 0.000000] [<ffffffff814ec621>] mutex_lock_nested+0x61/0x80
[ 0.000000] [<ffffffff810bcc1d>] ? trace_hardirqs_off+0x1d/0x30
[ 0.000000] [<ffffffff81a66ef8>] rcu_cpu_notify+0xa8/0x451
[ 0.000000] [<ffffffff814ecc31>] ? mutex_unlock+0x11/0x20
[ 0.000000] [<ffffffff81a3f952>] rcu_init+0x3b3/0x408
[ 0.000000] [<ffffffff81a21e0c>] start_kernel+0x34a/0x744
[ 0.000000] [<ffffffff81a216e6>] ? repair_env_string+0x81/0x81
[ 0.000000] [<ffffffff81a21120>] ? early_idt_handlers+0x120/0x120
[ 0.000000] [<ffffffff81a212fd>] x86_64_start_reservations+0x185/0x190
[ 0.000000] [<ffffffff81a214a8>] x86_64_start_kernel+0x1a0/0x1b6
[ 0.000000] ---[ end trace 8e966724b1809892 ]---
git bisect start 11eb5a2ce9655ee552f705f2da6b8f96f466c0d4 836dc9e3fbbab0c30aa6e664417225f5c1fb1c39 --
git bisect good 569104e21addbec215daa2d1d2c7b2a274ebcc27 # 1000 2013-02-15 00:30:51 drm/nouveau: share fence structures between nv10+ and nv50 implementations
git bisect good dacf4c29d7e3b2e0903fdef6da8296fe51132dca # 1000 2013-02-15 02:49:28 drm/nouveau/fifo/nvc0: improve interrupt handler somewhat
git bisect good 51fb70f904cf004272bd0a729959e4da76cb50dc # 1000 2013-02-15 05:08:37 drm/nouveau/gpio: pass number of on-die gpio lines to base
git bisect good 38bc1653a04eeea372a4249ef58d554ce38bd0f5 # 1000 2013-02-15 07:27:46 drm/nv84-/fence: abstract class emit/sync functions to virt+sequence
git bisect good ef80d98bdcaebca98ec9790ee584f903557ca149 # 1000 2013-02-15 09:47:41 drm/nouveau/fence: make internal hooks part of the context
git bisect good 5fdd378c20bb56824d57905591f1ec93a02513b9 # 1000 2013-02-15 12:06:18 arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.
git bisect good 5fdd378c20bb56824d57905591f1ec93a02513b9 # 3001 2013-02-15 18:58:19 arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.
git bisect good 075198d07b7b64b50afb54853dc390ded9962ee4 # 3000 2013-02-16 02:12:51 Revert "mutex: add support for reservation style locks"
git bisect good be03210f1bbdde34b397e86c115511667210fe98 # 3000 2013-02-18 05:35:58 Add linux-next specific files for 20130215
Thanks,
Fengguang
View attachment "dmesg-kvm-xbm-3830-2013-02-14-19-00-45-3.8.0-rc7-00071-g11eb5a2-180" of type "text/plain" (29906 bytes)
View attachment "11eb5a2-bisect.log" of type "text/plain" (30335 bytes)
View attachment ".config-bisect" of type "text/plain" (81978 bytes)