Message-ID: <1430940864.14545.80.camel@edumazet-glaptop2.roam.corp.google.com>
Date: Wed, 06 May 2015 12:34:24 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Willem de Bruijn <willemb@...gle.com>
Cc: netdev@...r.kernel.org, davem@...emloft.net
Subject: Re: [PATCH net-next 6/7] packet: rollover huge flows before small flows
On Wed, 2015-05-06 at 14:27 -0400, Willem de Bruijn wrote:
> From: Willem de Bruijn <willemb@...gle.com>
>
> Migrate flows from one socket to another socket in the fanout group not
> only when the first socket is full. Start migrating huge flows early, to
> divert possible 4-tuple attacks without affecting normal traffic.
>
> Introduce fanout_flow_is_huge(). This detects huge flows, defined as
> flows taking up more than half the load. It does so cheaply, by storing
> the rxhashes of the N most recent packets. If over half of these share
> the rxhash of the current packet, the flow is classified as huge and
> rolled over. This only protects against 4-tuple attacks. N is chosen to
> fit all data in a single cache line.
>
> Tested:
> Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input.
>
> lpbb5:/export/hda3/willemb# ./bench_rollover -l 1000 -r -s
> cpu       rx     rx.k  drop.k  rollover   r.huge  r.failed
>   0  1202599  1202599       0         0        0         0
>   1  1221096  1221096       0         0        0         0
>   2  1202296  1202296       0         0        0         0
>   3  1229998  1229998       0         0        0         0
>   4  1229551  1229551       0         0        0         0
>   5  1221097  1221097       0         0        0         0
>   6  1223496  1223496       0         0        0         0
>   7  1616768  1616768       0   8530027  8530027         0
>
> Signed-off-by: Willem de Bruijn <willemb@...gle.com>
> ---
> net/packet/af_packet.c | 30 +++++++++++++++++++++++++++---
> net/packet/internal.h | 4 ++++
> 2 files changed, 31 insertions(+), 3 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index d0c4c95..4e54b6b 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1326,6 +1326,24 @@ static int fanout_rr_next(struct packet_fanout *f, unsigned int num)
> 	return x;
> }
>
> +static bool fanout_flow_is_huge(struct packet_sock *po, struct sk_buff *skb)
> +{
> +	u32 rxhash;
> +	int i, count = 0;
> +
> +	rxhash = skb_get_hash(skb);
> +	spin_lock(&po->rollover->hist_lock);
> +	for (i = 0; i < ROLLOVER_HLEN; i++)
> +		if (po->rollover->history[i] == rxhash)
> +			count++;
> +
> +	i = po->rollover->hist_idx++ & (ROLLOVER_HLEN - 1);
> +	po->rollover->history[i] = rxhash;
> +	spin_unlock(&po->rollover->hist_lock);
> +
> +	return count > (ROLLOVER_HLEN >> 1);
> +}
> +
I am not a huge fan of this strategy or memory placement: the spinlock
protects something that should be a hint rather than an ultra-precise
decision. By the time the lock is released, the status might already be
stale anyway.

You touch three cache lines here: one for rollover->hist_idx++, one for
history[i] = rxhash, and one for the spinlock.
(And the following patch adds an atomic_long_inc() for the stats.)
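
Something like this (completely untested, just a sketch) would keep the
history as a pure hint and drop the lock entirely; racy reads/writes are
fine since an occasional miscount is harmless here. The replacement
policy shown (overwriting a pseudo-random slot via prandom_u32() instead
of bumping a shared index) is only an illustration, and it assumes the
ROLLOVER_HLEN u32 slots fit in one cache line, as the changelog says:

static bool fanout_flow_is_huge(struct packet_sock *po, struct sk_buff *skb)
{
	u32 *history = po->rollover->history;
	u32 rxhash = skb_get_hash(skb);
	int i, count = 0;

	/* Racy reads are fine: this is only a hint. */
	for (i = 0; i < ROLLOVER_HLEN; i++)
		if (READ_ONCE(history[i]) == rxhash)
			count++;

	/* Overwrite a pseudo-random slot instead of bumping a shared
	 * index: no hist_idx, no spinlock, fewer dirtied cache lines.
	 */
	WRITE_ONCE(history[prandom_u32() % ROLLOVER_HLEN], rxhash);

	return count > (ROLLOVER_HLEN >> 1);
}

That would leave the history array itself as the only shared dirty cache
line on this path.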