Message-ID: <20110524213327.GA3917@dev1756.snc6.facebook.com>
Date: Tue, 24 May 2011 14:33:27 -0700
From: Arun Sharma <asharma@...com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Arun Sharma <asharma@...com>,
Maximilian Engelhardt <maxi@...monizer.de>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
StuStaNet Vorstand <vorstand@...sta.mhn.de>
Subject: Re: Kernel crash after using new Intel NIC (igb)
On Thu, May 12, 2011 at 11:15:53PM +0200, Eric Dumazet wrote:
>
> Probably not.
>
> What gives slub_nomerge=1 for you ?
>
It took me a while to get a new kernel on a large enough sample
of machines to get some data.
As you observed in the other thread, this is unlikely to be random
memory corruption.
The panics stopped after we moved the list_empty() check under the lock:
--- a/net/ipv4/inetpeer.c
+++ b/net/ipv4/inetpeer.c
@@ -154,11 +154,11 @@ void __init inet_initpeers(void)
 /* Called with or without local BH being disabled. */
 static void unlink_from_unused(struct inet_peer *p)
 {
+	spin_lock_bh(&unused_peers.lock);
 	if (!list_empty(&p->unused)) {
-		spin_lock_bh(&unused_peers.lock);
 		list_del_init(&p->unused);
-		spin_unlock_bh(&unused_peers.lock);
 	}
+	spin_unlock_bh(&unused_peers.lock);
 }
 
 static int addr_compare(const struct inetpeer_addr *a,
The idea is that the list gets corrupted by some kind of race
condition. Two threads racing on list_empty() and then both executing
list_del_init() seems harmless, though.
There is probably a different race condition that is mitigated by doing
the list_empty() check under the lock.
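For illustration only, the check-under-lock pattern from the patch can be
sketched in userspace C. This is not the kernel code: a pthread mutex stands
in for spin_lock_bh(), the list helpers are simplified re-implementations of
the kernel ones, and struct peer, unused_list, unused_lock and add_to_unused()
are hypothetical names made up for the sketch.

```c
#include <pthread.h>
#include <stdbool.h>

/* Simplified userspace stand-ins for the kernel's struct list_head helpers. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink the entry and leave it self-linked, so list_empty(e) is true. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

/* Hypothetical analogue of unused_peers: a list head and the lock guarding it. */
static struct list_head unused_list = { &unused_list, &unused_list };
static pthread_mutex_t unused_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, stripped-down stand-in for struct inet_peer. */
struct peer {
	struct list_head unused;
};

static void add_to_unused(struct peer *p)
{
	pthread_mutex_lock(&unused_lock);
	list_add_tail(&p->unused, &unused_list);
	pthread_mutex_unlock(&unused_lock);
}

/*
 * The emptiness check and the unlink are in the same critical section.
 * Returns true if the peer was on the list and has been removed.
 */
static bool unlink_from_unused(struct peer *p)
{
	bool removed = false;

	pthread_mutex_lock(&unused_lock);
	if (!list_empty(&p->unused)) {
		list_del_init(&p->unused);
		removed = true;
	}
	pthread_mutex_unlock(&unused_lock);
	return removed;
}
```

As long as every writer takes unused_lock, nothing can modify p->unused
between the check and the unlink. Calling unlink_from_unused() on a peer that
is not on the list returns false and leaves the list untouched.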
-Arun