[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090330130532.1e433313@nehalam>
Date: Mon, 30 Mar 2009 13:05:32 -0700
From: Stephen Hemminger <shemminger@...tta.com>
To: Eric Dumazet <dada1@...mosbay.com>
Cc: Patrick McHardy <kaber@...sh.net>,
David Miller <davem@...emloft.net>,
Rick Jones <rick.jones2@...com>, netdev@...r.kernel.org,
netfilter-devel@...r.kernel.org
Subject: Re: [PATCH] netfilter: finer grained nf_conn locking
On Mon, 30 Mar 2009 21:57:15 +0200
Eric Dumazet <dada1@...mosbay.com> wrote:
> Stephen Hemminger a écrit :
> > On Sat, 28 Mar 2009 17:55:38 +0100
> > Eric Dumazet <dada1@...mosbay.com> wrote:
> >
> >> Eric Dumazet a écrit :
> >>> Patrick McHardy a écrit :
> >>>> Stephen Hemminger wrote:
> >>>>
> >>>>> @@ -50,6 +50,7 @@ struct ip_ct_tcp_state {
> >>>>>
> >>>>> struct ip_ct_tcp
> >>>>> {
> >>>>> + spinlock_t lock;
> >>>>> struct ip_ct_tcp_state seen[2]; /* connection parameters per
> >>>>> direction */
> >>>>> u_int8_t state; /* state of the connection (enum
> >>>>> tcp_conntrack) */
> >>>>> /* For detecting stale connections */
> >>>> Eric already posted a patch to use an array of locks, which is
> >>>> a better approach IMO since it keeps the size of the conntrack
> >>>> entries down.
> >>> Yes, we probably can use an array for short lived lock sections.
> >
> > I am not a fan of the array of locks. Sizing it is awkward and
> > it is vulnerable to hash collisions. Let's see if there is another
> > better way.
>
> On normal machines, (no debugging spinlocks), patch uses an embedded
> spinlock. We probably can use this even on 32bit kernels, considering
> previous patch removed the rcu_head (8 bytes on 32bit arches) from
> nf_conn :)
>
> if LOCKDEP is on, size of a spinlock is 64 bytes on x86_64.
> Adding a spinlock on each nf_conn would be too expensive. In this
> case, an array of spinlock is a good compromise, as done in
> IP route cache, tcp ehash, ...
>
> I agree sizing of this hash table is not pretty, and should be
> a generic kernel service (I wanted such service for futexes for example)
>
IMO having different locking based on lockdep and architecture is an invitation
to future obscure problems. Perhaps some other locking method or shrinking
ct entry would be better.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists