Message-ID: <4A55222E.5030405@easycrypt.de>
Date: Thu, 09 Jul 2009 00:48:14 +0200
From: Timm Korte <korte-kernel@...ycrypt.de>
To: lkml <linux-kernel@...r.kernel.org>
Subject: "impossible" spinlock "wrong CPU" problem with custom device driver
I'm trying to understand a spinlock bug in a kernel module (device driver).
I have a spinlock that is used in the actual hardware interrupt handler
as well as in a separate kernel thread that does the real work via a work
queue. The interrupt handler takes the lock with spin_lock() and
spin_unlock(), while the thread uses spin_lock_irqsave() and
spin_unlock_irqrestore().
On rare occasions (I can't reproduce it on purpose), I get a spinlock debug
message about a wrong CPU in _raw_spin_unlock when it is called from the
kernel thread.
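For readers unfamiliar with the message: a minimal userspace sketch of the kind of check that produces it, assuming the spinlock debug code records the CPU number when the lock is taken and compares it against the current CPU on unlock. The names dbg_lock, dbg_lock_acquire and dbg_lock_release are hypothetical and only model that comparison; this is not the kernel's implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a debug spinlock; not the kernel's data structure. */
struct dbg_lock {
	bool locked;
	int owner_cpu;	/* CPU recorded at lock time, -1 while unowned */
};

static void dbg_lock_init(struct dbg_lock *l)
{
	l->locked = false;
	l->owner_cpu = -1;
}

/* Record which CPU took the lock (the caller passes its CPU number). */
static void dbg_lock_acquire(struct dbg_lock *l, int cpu)
{
	l->locked = true;
	l->owner_cpu = cpu;
}

/*
 * Release the lock. Returns NULL on success, or a diagnostic string
 * and leaves the lock untouched if a debug check fails.
 */
static const char *dbg_lock_release(struct dbg_lock *l, int cpu)
{
	if (!l->locked)
		return "already unlocked";
	if (l->owner_cpu != cpu)
		return "wrong CPU";
	l->owner_cpu = -1;
	l->locked = false;
	return NULL;
}
```

In this model the check trips in two situations: the releasing code really runs on a different CPU than the one that took the lock, or the stored owner_cpu value was changed while the lock was held.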
This is the source (for the kernel_thread) that runs into the problem:
static int my_irqthread_function(void *ptr) {
	struct my_dev *mydev = ptr;

	daemonize(MY_NAME "%02x", mydev->mynum);
	allow_signal(SIGTERM);

	while (!wait_event_interruptible(mydev->irqthread_wait,
			atomic_read(&mydev->irqthread_pending_count))) {
		do {
			uint8_t my_irq_pending = 0;
			unsigned long iflags;

			spin_lock_irqsave(&mydev->irq_pending_lock, iflags);
			my_irq_pending = mydev->irq_pending;
			mydev->irq_pending = 0;
			spin_unlock_irqrestore(&mydev->irq_pending_lock, iflags);

			// handle irqs
			if (my_irq_pending & INT_IPAC1) {
				my_handle_interrupt(&mydev->mydev[IPAC1]);
			}
			...
			// continue if the pending count still is != 0 after decrementing
		} while (!atomic_dec_and_test(&mydev->irqthread_pending_count));
	}

	mydev->irqthread = 0;
	complete_and_exit(&mydev->irqthread_exit, 0);
}
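The drain loop above can be modelled in userspace to see how many passes it makes per wakeup. This is only a sketch under assumptions: the interrupt side is not shown in this message, so model_raise() guesses that it ORs a bit into irq_pending and increments the pending count; the spinlock is left out because the model is single-threaded; all model_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of struct my_dev. */
struct model_dev {
	atomic_int pending_count;	/* models irqthread_pending_count */
	uint8_t irq_pending;		/* models irq_pending (lock omitted) */
};

/* Same contract as atomic_dec_and_test(): true when the new value is 0. */
static bool model_dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/* ASSUMED interrupt side: flag the source and bump the counter. */
static void model_raise(struct model_dev *d, uint8_t bits)
{
	d->irq_pending |= bits;
	atomic_fetch_add(&d->pending_count, 1);
}

/*
 * One wakeup of the thread. Returns the number of passes through the
 * do/while body and stores the union of all bits seen in *seen.
 */
static int model_drain(struct model_dev *d, uint8_t *seen)
{
	int passes = 0;

	*seen = 0;
	if (atomic_load(&d->pending_count) == 0)
		return 0;	/* the real thread would keep sleeping */
	do {
		uint8_t bits = d->irq_pending;	/* snapshot and clear */

		d->irq_pending = 0;
		*seen |= bits;
		passes++;
	} while (!model_dec_and_test(&d->pending_count));
	return passes;
}
```

Under these assumptions, two interrupts arriving before the thread runs lead to two passes: the first pass picks up both bits at once and the second finds irq_pending already cleared, so extra passes are harmless but do occur.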
The error (SPIN_BUG with a kernel panic on my SMP box) happens on the
"spin_unlock_irqrestore(&mydev->irq_pending_lock, iflags);" line, but I
really can't figure out how the thread could be moved to another CPU
while holding the lock and only doing two assignment operations.
The only thing I could think of is that it might have something to do
with the enabled SIGTERM signal, even though the module wasn't being
unloaded at the time the bug occurred.
System is FC4 based with a 2.6.17 kernel (can't change).
So I'm sort of out of ideas and hope someone here has an idea what
might have gone wrong.
Timm