Message-ID: <45E6C2F2.8090007@redhat.com>
Date: Thu, 01 Mar 2007 07:11:30 -0500
From: Prarit Bhargava <prarit@...hat.com>
To: Andrew Morton <akpm@...ux-foundation.org>
CC: linux-kernel@...r.kernel.org, jbeulich@...ell.com,
anil.s.keshavamurthy@...el.com, amul.shah@...sys.com
Subject: Re: [PATCH]: Use stop_machine_run in the Intel RNG driver
> I think what you're describing here is just the standard old
> smp_call_function() deadlock, rather than anything which is specific to
> intel-rng, yes?
>
> It is "well known" that you can't call smp_call_function() with local
> interrupts disabled. In fact i386 will spit a warning if you try it.
>
>
> intel-rng doesn't do that, but what it _does_ do is:
>
> smp_call_function(..., wait = 0);
> local_irq_disable();
>
> so some CPUs will still be entering the IPI while this CPU has gone and
> disabled interrupts, thus exposing us to the deadlock, yes?
>
Not quite, Andrew. This one was different and a little more complicated.
The deadlock occurs because two CPUs are in contention over a rwlock,
and the "call_function" puts the CPUs in a state where no forward
progress can be made until the calling CPU has completed its code.
Here's a more detailed example (sorry for the cut-and-paste):

1. CPU A has done read_lock(&lock), and has acquired the lock.

2. CPU B has done write_lock_irq(&lock) and is waiting for A to release
the lock. CPU B has disabled local interrupts while it waits for the lock:
void __lockfunc _write_lock_irq(rwlock_t *lock)
{
        local_irq_disable();
        preempt_disable();
        rwlock_acquire(&lock->dep_map, 0, 0, _RET_IP_);
        _raw_write_lock(lock);
}

3. CPU C issues smp_call_function, as in the case of the intel-rng driver:
        set_mb(waitflag, 1);
        smp_call_function(intel_init_wait, NULL, 1, 0);
        ...
        // do some stuff with interrupts disabled
        ...
        set_mb(waitflag, 0);

where
static char __initdata waitflag;

static void __init intel_init_wait(void *unused)
{
        while (waitflag)
                cpu_relax();
}

In this code the calling processor, C, has issued an IPI and disabled
interrupts on every processor except itself. When each processor takes
the IPI, it runs intel_init_wait and spins in a tight loop until
waitflag is zero, i.e. no forward progress on any CPU.

CPU C will not execute the code below the smp_call_function until all
processors have started (not completed!) the IPI function. From
smp_call_function:
        cpus = num_online_cpus() - 1;
        ...
        /* Send a message to all other CPUs and wait for them to respond */
        send_IPI_allbutself(CALL_FUNCTION_VECTOR);
        /* Wait for response */
        while (atomic_read(&data.started) != cpus)
                cpu_relax();

So CPU C is waiting here.

4. CPU A, which holds the read lock, sees the IPI and is now sitting in
the intel_init_wait code, happily waiting. CPU A has incremented
data.started, but it will stay in that loop until CPU C sets waitflag = 0.

5. CPU B, if you recall, is _waiting with interrupts disabled_ for CPU A
to release the lock. It does not see the IPI because interrupts are
disabled, and it will not see it until CPU A has released the lock.

6. CPU C is eventually waiting only for CPU B to do the final increment
that makes data.started == cpus. CPU B is waiting for CPU A to release
the lock. CPU A is executing a tight loop that it will not exit until
CPU C sets waitflag to zero.

That's a 3-way deadlock.
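
Purely as an illustration (and obviously not the kernel code), the same
wait-for cycle can be reproduced in user space if you model "interrupts
disabled" as a thread that never polls its "IPI" flag, and
smp_call_function() as setting those flags and then waiting for every
target to report that it started the handler. All the names below are
made up for the demo, and the program hangs by design:

/* build: gcc -pthread deadlock-demo.c -- it hangs, which is the point */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_int waitflag, started, a_holds_lock;
static atomic_int ipi_a;                /* "IPI" delivered to thread A */

/* steps 1+4: holds the read lock, then services the "IPI" and spins */
static void *cpu_a(void *unused)
{
        (void)unused;
        pthread_rwlock_rdlock(&lock);
        atomic_store(&a_holds_lock, 1);
        while (!atomic_load(&ipi_a))    /* "interrupts on": poll for IPI */
                ;
        atomic_fetch_add(&started, 1);  /* entered the handler */
        while (atomic_load(&waitflag))  /* spins until C clears waitflag */
                ;
        pthread_rwlock_unlock(&lock);   /* never reached */
        return NULL;
}

/* steps 2+5: write_lock_irq() -- blocks for the lock with "interrupts
 * disabled", i.e. it never polls an IPI flag while it waits */
static void *cpu_b(void *unused)
{
        (void)unused;
        while (!atomic_load(&a_holds_lock))
                ;
        pthread_rwlock_wrlock(&lock);   /* blocked behind A forever */
        atomic_fetch_add(&started, 1);  /* never reached */
        return NULL;
}

/* steps 3+6: the smp_call_function() caller */
static void *cpu_c(void *unused)
{
        (void)unused;
        sleep(1);                       /* crude ordering: A, B get into place */
        atomic_store(&waitflag, 1);     /* set_mb(waitflag, 1) */
        atomic_store(&ipi_a, 1);        /* send_IPI_allbutself() */
        while (atomic_load(&started) != 2)      /* data.started != cpus */
                ;
        atomic_store(&waitflag, 0);     /* never reached: the 3-way deadlock */
        return NULL;
}

int main(void)
{
        pthread_t a, b, c;
        pthread_create(&a, NULL, cpu_a, NULL);
        pthread_create(&b, NULL, cpu_b, NULL);
        pthread_create(&c, NULL, cpu_c, NULL);
        printf("threads running; if the analysis above is right, this hangs\n");
        pthread_join(c, NULL);          /* never returns */
        return 0;
}

Thread C never sees started == 2, for exactly the reason described in
step 6.
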
So, the issue is placing the other CPUs in a state where they cannot make
forward progress. The deadlock occurs before the calling CPU has even
disabled interrupts in the code in step 3.

I also tested the driver code without the __init/__initdata tags and with
waitflag explicitly initialized to zero, to rule out the "gcc only zeroes
.bss section variables" error that someone fixed last week. I also
removed the code that disables interrupts on the calling processor, which
had no effect -- at first I thought it was a simple interrupt issue ...

Maybe smp_call_function needs a written warning that the called function
should not "suspend" CPUs?
P.