Message-ID: <4A390CF4.7060909@trash.net>
Date: Wed, 17 Jun 2009 17:34:12 +0200
From: Patrick McHardy <kaber@...sh.net>
To: Eric Dumazet <eric.dumazet@...il.com>
CC: Ingo Molnar <mingo@...e.hu>, David Miller <davem@...emloft.net>,
Thomas Gleixner <tglx@...utronix.de>,
torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [bug] __nf_ct_refresh_acct(): WARNING: at lib/list_debug.c:30 __list_add+0x7d/0xad()
Eric Dumazet wrote:
> Patrick McHardy wrote:
>> I'm having some trouble figuring out the exact events that would
>> lead to the timer base corruption. Ingo, could you please test
>> this patch to make sure it also fixes the problem?
>
> ;)
>
> The sequence of events can be described as follows:
>
> CPU1                                              CPU2
>
> /* __nf_conntrack_confirm() */
> __nf_conntrack_hash_insert(ct, hash, repl_hash);
> // now 'ct' is visible by other cpus
>                                                   // search conntrack and find ct
> // timeout.expires becomes absolute here
> ct->timeout.expires += jiffies;
> add_timer(&ct->timeout);
>
>                                                   /* __nf_ct_refresh_acct() */
>                                                   if (!nf_ct_is_confirmed(ct)) {
>                                                       // we *believe* timeout.expires
>                                                       // is not yet in use by timer code
>                                                       // and is still a relative quantity.
>                                                       // We want to 'update' it but we should not !
>                                                       ct->timeout.expires = extra_jiffies; << CORRUPTION >>
>                                                   } else {
> // too late :(
> set_bit(IPS_CONFIRMED_BIT, &ct->status);
>
> This is how I understood the problem, but I may be wrong?
That's one case that can happen, but that wouldn't corrupt the
timer base AFAICS. Also, the call path shows that it actually went
into the mod_timer_pending() path *and* timer_pending() was true.
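
For readers following the thread, the interleaving quoted above can be
replayed deterministically in user space. The sketch below is only a
single-threaded model under stated assumptions: struct toy_ct and the
cpu1_*/cpu2_* helpers are hypothetical stand-ins, not kernel code, and
plain fields replace the real timer and the IPS_CONFIRMED_BIT status
bit. It shows only that expires can end up holding a relative value
while the timer is already pending; it does not by itself explain a
corrupted timer base.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a conntrack entry; NOT the kernel's struct nf_conn. */
struct toy_ct {
    unsigned long expires;  /* relative until the timer is armed, absolute after */
    bool hashed;            /* entry is visible to other CPUs via the hash table */
    bool timer_pending;     /* models add_timer() having queued the timer */
    bool confirmed;         /* models IPS_CONFIRMED_BIT */
};

/* CPU1, first half of the confirm path: publish the entry, then arm the timer. */
static void cpu1_insert_and_arm(struct toy_ct *ct, unsigned long now)
{
    ct->hashed = true;          /* from here on, CPU2 can find the entry */
    ct->expires += now;         /* relative -> absolute */
    ct->timer_pending = true;   /* timer is now queued */
}

/* CPU1, second half: the confirmed flag is only set after the timer is armed. */
static void cpu1_set_confirmed(struct toy_ct *ct)
{
    ct->confirmed = true;
}

/* CPU2, refresh path: returns true if it overwrote expires directly. */
static bool cpu2_refresh(struct toy_ct *ct, unsigned long extra_jiffies)
{
    if (!ct->hashed)
        return false;                 /* a lookup would not find the entry */
    if (!ct->confirmed) {
        ct->expires = extra_jiffies;  /* stores a relative value */
        return true;
    }
    return false;                     /* confirmed entries are not modelled here */
}

/* Replays the quoted interleaving step by step and returns the final state. */
static struct toy_ct replay_interleaving(unsigned long now,
                                         unsigned long timeout,
                                         unsigned long extra_jiffies)
{
    struct toy_ct ct = { .expires = timeout };

    cpu1_insert_and_arm(&ct, now);
    assert(ct.expires == now + timeout);  /* absolute at this point */

    cpu2_refresh(&ct, extra_jiffies);     /* runs inside the window */
    cpu1_set_confirmed(&ct);              /* too late to stop CPU2 */
    return ct;
}
```

Calling replay_interleaving(1000, 30, 120) in this model leaves
timer_pending set while expires holds the relative value 120 rather
than an absolute one.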