Message-Id: <1264836971.7499.4.camel@tonnant>
Date: Sat, 30 Jan 2010 02:36:11 -0500
From: Jon Masters <jonathan@...masters.org>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
netdev <netdev@...r.kernel.org>,
netfilter-devel <netfilter-devel@...r.kernel.org>,
Patrick McHardy <kaber@...sh.net>
Subject: Re: debug: nf_conntrack and KVM crash
On Sat, 2010-01-30 at 07:58 +0100, Eric Dumazet wrote:
> On Friday, 29 January 2010 at 20:59 -0500, Jon Masters wrote:
> > On Fri, 2010-01-29 at 20:57 -0500, Jon Masters wrote:
> >
> > > Ah so I should have realized before but I wasn't looking at valid values
> > > for the range of the hashtable yet, nf_conntrack_htable_size is getting
> > > wildly out of whack. It goes from:
> > >
> > > (gdb) print nf_conntrack_hash_rnd
> > > $1 = 2688505299
> > > (gdb) print nf_conntrack_htable_size
> > > $2 = 16384
> > >
> > > nf_conntrack_events: 1
> > > nf_conntrack_max: 65536
> > >
> > > Shortly after booting, before being NULLed shortly after starting some
> > > virtual machines (the hash isn't reset, whereas it is recomputed if the
> > > hashtable is re-initialized after an intentional resizing operation):
> >
> > I mean the *seed* isn't changed, so I don't think it was resized
> > intentionally. I wonder where else htable_size is fiddled with.
> This rings a bell here, since another crash analysis on another problem
> suggested to me a potential problem with read_mostly and modules, but I
> had no time to confirm the thing yet.
>
> Could you try changing
>
>
> net/netfilter/nf_conntrack_core.c:57:unsigned int nf_conntrack_htable_size __read_mostly;
> to
> net/netfilter/nf_conntrack_core.c:57:unsigned int nf_conntrack_htable_size ;

I'll play with that later. Right now I'm going through every iptables/ip
call that libvirt makes. It explicitly plays with the netns for the
loopback, which looks interesting. Even supposing that does cause the
hashtables to get unintentionally zeroed or the sizing to get wiped out,
we should nonetheless catch the case where the hash function generates a
wildly out-of-range number, or where the hash size is zero when we want
to use it.
It's amazing more people aren't talking about this. It happens on several
Fedora boxes that I know of, and I'm sure on many more that use KVM+nf.

Jon.