[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1275368732.2478.88.camel@edumazet-laptop>
Date: Tue, 01 Jun 2010 07:05:32 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Changli Gao <xiaosuo@...il.com>
Cc: hawk@...x.dk, Jesper Dangaard Brouer <hawk@...u.dk>,
paulmck@...ux.vnet.ibm.com, Patrick McHardy <kaber@...sh.net>,
Linux Kernel Network Hackers <netdev@...r.kernel.org>,
Netfilter Developers <netfilter-devel@...r.kernel.org>
Subject: Re: DDoS attack causing bad effect on conntrack searches
Le mardi 01 juin 2010 à 08:28 +0800, Changli Gao a écrit :
> On Tue, Jun 1, 2010 at 5:21 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> >
> > I had a look at current conntrack and found the 'unconfirmed' list was
> > maybe a candidate for a potential blackhole.
> >
> > That is, if a reader happens to hit an entry that is moved from regular
> > hash table slot 'hash' to unconfirmed list,
>
> Sorry, but I can't find where we do this things. unconfirmed list is
> used to track the unconfirmed cts, whose corresponding skbs are still
> in path from the first to the last netfilter hooks. As soon as the
> skbs end their travel in netfilter, the corresponding cts will be
> confirmed(moving ct from unconfirmed list to regular hash table).
>
So netfilter is a monolithic thing.
When a packet begins its travel into netfilter, you guarantee that no
other packet can also begin its travel and find an unconfirmed
conntrack ?
I wonder why we use atomic ops then to track the confirmed bit :)
> unconfirmed list should be small, as networking receiving is in BH.
So according to you, netfilter/ct runs only in input path ?
So I assume a packet is handled by CPU X, creates a new conntrack
(possibly early droping an old entry that was previously in a standard
hash chain), inserted in unconfirmed list. _You_ guarantee another CPU
Y, handling another packet, possibly sent by a hacker reading your
netdev mails, cannot find the conntrack that was early dropped ?
> How about implementing unconfirmed list as a per cpu variable?
I first implemented such a patch to reduce cache line contention, then I
asked to myself : What is exactly an unconfirmed conntrack ? Can their
number be unbounded ? If yes, we have a problem, even on a two cpus
machine. Using two lists instead of one wont solve the fundamental
problem.
The real question is, why do we need this unconfirmed 'list' in the
first place. Is it really a private per cpu thing ? Can you prove this,
in respect of lockless lookups, and things like NFQUEUE ?
Each conntrack object has two list anchors. One for IP_CT_DIR_ORIGINAL,
one for IP_CT_DIR_REPLY.
Unconfirmed list use the first anchor. This means another cpu can
definitely find an unconfirmed item in a regular hash chain, since we
dont respect an RCU grace period before re-using an object.
If memory was not a problem, we probably would use a third anchor to
avoid this, or regular RCU instead of SLAB_DESTROY_BY_RCU variant.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists