Message-Id: <20120914135931.262f6d0c.akpm@linux-foundation.org>
Date: Fri, 14 Sep 2012 13:59:31 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Vikram Mulukutla <markivx@...eaurora.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
linux-kernel@...r.kernel.org, sboyd@...eaurora.org
Subject: Re: [PATCH] lib: spinlock_debug: Avoid livelock in do_raw_spin_lock
On Thu, 13 Sep 2012 20:15:26 -0700
Vikram Mulukutla <markivx@...eaurora.org> wrote:
> The logic in do_raw_spin_lock attempts to acquire a spinlock
> by invoking arch_spin_trylock in a loop with a delay between
> each attempt. Now consider the following situation in a 2
> CPU system:
>
> 1. CPU-0 continually acquires and releases a spinlock in a
> tight loop; it stays in this loop until some condition X
> is satisfied. X can only be satisfied by another CPU.
>
> 2. CPU-1 tries to acquire the same spinlock, in an attempt
> to satisfy the aforementioned condition X. However, it
> never sees the unlocked value of the lock, because the
> debug spinlock code uses trylock instead of a plain lock:
> it samples the lock at exactly the wrong moments, i.e.
> whenever CPU-0 happens to be holding it.
>
> Now in the absence of debug spinlocks, the architecture specific
> spinlock code can correctly allow CPU-1 to wait in a "queue"
> (e.g., ticket spinlocks), ensuring that it acquires the lock at
> some point. However, with the debug spinlock code, livelock
> can easily occur because trylock obviously cannot put the
> CPU in that "queue". Both the x86 and ARM spinlock code
> implement such a queueing mechanism.
>
> Note that the situation mentioned above is not hypothetical.
> A real problem was encountered where CPU-0 was running
> hrtimer_cancel with interrupts disabled, and CPU-1 was attempting
> to run the hrtimer that CPU-0 was trying to cancel.
I'm surprised. Yes, I guess it will happen if both CPUs have local
interrupts disabled. And perhaps serialisation of the locked operation
helps.
> ...
>
> --- a/lib/spinlock_debug.c
> +++ b/lib/spinlock_debug.c
> @@ -107,23 +107,27 @@ static void __spin_lock_debug(raw_spinlock_t *lock)
> {
> u64 i;
> u64 loops = loops_per_jiffy * HZ;
> - int print_once = 1;
>
> - for (;;) {
> - for (i = 0; i < loops; i++) {
> - if (arch_spin_trylock(&lock->raw_lock))
> - return;
> - __delay(1);
> - }
> - /* lockup suspected: */
> - if (print_once) {
> - print_once = 0;
> - spin_dump(lock, "lockup suspected");
> + for (i = 0; i < loops; i++) {
> + if (arch_spin_trylock(&lock->raw_lock))
> + return;
> + __delay(1);
> + }
> + /* lockup suspected: */
> + spin_dump(lock, "lockup suspected");
> #ifdef CONFIG_SMP
> - trigger_all_cpu_backtrace();
> + trigger_all_cpu_backtrace();
> #endif
> - }
> - }
> +
> + /*
> + * In case the trylock above was causing a livelock, give the lower
> + * level arch specific lock code a chance to acquire the lock. We have
> + * already printed a warning/backtrace at this point. The non-debug arch
> + * specific code might actually succeed in acquiring the lock. If it is
> + * not successful, the end-result is the same - there is no forward
> + * progress.
> + */
> + arch_spin_lock(&lock->raw_lock);
> }
The change looks reasonable to me. I suggest we disambiguate that
comment a bit:
--- a/lib/spinlock_debug.c~lib-spinlock_debug-avoid-livelock-in-do_raw_spin_lock-fix
+++ a/lib/spinlock_debug.c
@@ -120,7 +120,7 @@ static void __spin_lock_debug(raw_spinlo
#endif
/*
- * In case the trylock above was causing a livelock, give the lower
+ * The trylock above was causing a livelock. Give the lower
* level arch specific lock code a chance to acquire the lock. We have
* already printed a warning/backtrace at this point. The non-debug arch
* specific code might actually succeed in acquiring the lock. If it is
_