netdev - Re: [PATCH net-next 6/7] packet: rollover huge flows before small flows

Message-ID: <CA+FuTSfQ5vXQeo8FqU-hi3zTqu4hx+JqJmNbmRp95M0O3um1kA@mail.gmail.com>
Date:	Wed, 6 May 2015 16:06:34 -0400
From:	Willem de Bruijn <willemb@...gle.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Network Development <netdev@...r.kernel.org>,
	David Miller <davem@...emloft.net>
Subject: Re: [PATCH net-next 6/7] packet: rollover huge flows before small flows

On Wed, May 6, 2015 at 3:34 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> On Wed, 2015-05-06 at 14:27 -0400, Willem de Bruijn wrote:
>> From: Willem de Bruijn <willemb@...gle.com>
>>
>> Migrate flows from a socket to another socket in the fanout group not
>> only when the socket is full. Start migrating huge flows early, to
>> divert possible 4-tuple attacks without affecting normal traffic.
>>
>> Introduce fanout_flow_is_huge(). This detects huge flows, which are
>> defined as taking up more than half the load. It does so cheaply, by
>> storing the rxhashes of the N most recent packets. If over half of
>> these are the same rxhash as the current packet, then drop it. This
>> only protects against 4-tuple attacks. N is chosen to fit all data in
>> a single cache line.
>>
>> Tested:
>>   Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input.
>>
>>       lpbb5:/export/hda3/willemb# ./bench_rollover -l 1000 -r -s
>>       cpu        rx       rx.k     drop.k   rollover     r.huge   r.failed
>>        0    1202599    1202599          0          0          0          0
>>        1    1221096    1221096          0          0          0          0
>>        2    1202296    1202296          0          0          0          0
>>        3    1229998    1229998          0          0          0          0
>>        4    1229551    1229551          0          0          0          0
>>        5    1221097    1221097          0          0          0          0
>>        6    1223496    1223496          0          0          0          0
>>        7    1616768    1616768          0    8530027    8530027          0
>>
>> Signed-off-by: Willem de Bruijn <willemb@...gle.com>
>> ---
>>  net/packet/af_packet.c | 30 +++++++++++++++++++++++++++---
>>  net/packet/internal.h  |  4 ++++
>>  2 files changed, 31 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index d0c4c95..4e54b6b 100644
>> --- a/net/packet/af_packet.c
>> +++ b/net/packet/af_packet.c
>> @@ -1326,6 +1326,24 @@ static int fanout_rr_next(struct packet_fanout *f, unsigned int num)
>>       return x;
>>  }
>>
>> +static bool fanout_flow_is_huge(struct packet_sock *po, struct sk_buff *skb)
>> +{
>> +     u32 rxhash;
>> +     int i, count = 0;
>> +
>> +     rxhash = skb_get_hash(skb);
>> +     spin_lock(&po->rollover->hist_lock);
>> +     for (i = 0; i < ROLLOVER_HLEN; i++)
>> +             if (po->rollover->history[i] == rxhash)
>> +                     count++;
>> +
>> +     i = po->rollover->hist_idx++ & (ROLLOVER_HLEN - 1);
>> +     po->rollover->history[i] = rxhash;
>> +     spin_unlock(&po->rollover->hist_lock);
>> +
>> +     return count > (ROLLOVER_HLEN >> 1);
>> +}
>> +
>
> I am not a huge fan of this strategy or memory placement, because of the
> spinlock that protects something which should be a hint, more than an
> ultra precise decision. At the time lock is released, the status might
> already be imprecise.
>
> You touch 3 cache lines here, one for rollover->hist_idx++, one for
> history[i] = hash, and one for the spinlock.

Do you suggest running lockless, similar to the rfs table?
I can also reduce the history length so that hist_idx fits in the
same cache line as the history array.
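
To make the question concrete, here is a rough userspace sketch of what
a lockless variant could look like. It is not the patch and has not been
run in the kernel. The names rollover_hist, HIST_LEN, LOAD_ONCE and
STORE_ONCE are made up for illustration (the macros stand in for the
kernel's READ_ONCE/WRITE_ONCE), and a 64-byte cache line is assumed.
The history is cut to 15 entries so that it and the index share one
line, and the spinlock is dropped, so concurrent callers may lose
updates. That is acceptable only if the result is treated as a hint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. 15 hashes + 1 index = 16 * 4 bytes. */
#define HIST_LEN 15

struct rollover_hist {
	uint32_t history[HIST_LEN];	/* rxhash of the most recent packets */
	uint32_t hist_idx;		/* next slot to overwrite */
} __attribute__((aligned(64)));

static_assert(sizeof(struct rollover_hist) == 64,
	      "history and index must share one cache line");

/* Hypothetical stand-ins for READ_ONCE()/WRITE_ONCE(): single plain access. */
#define LOAD_ONCE(x)	 (*(volatile uint32_t *)&(x))
#define STORE_ONCE(x, v) (*(volatile uint32_t *)&(x) = (v))

/*
 * Returns true if more than half of the recorded hashes match rxhash.
 * No lock is taken: two CPUs may read the same index and overwrite the
 * same slot, so the answer is approximate.
 */
static bool flow_is_huge(struct rollover_hist *h, uint32_t rxhash)
{
	unsigned int i, count = 0;

	for (i = 0; i < HIST_LEN; i++)
		if (LOAD_ONCE(h->history[i]) == rxhash)
			count++;

	/* HIST_LEN is not a power of two, so wrap with modulo, not a mask. */
	i = LOAD_ONCE(h->hist_idx) % HIST_LEN;
	STORE_ONCE(h->history[i], rxhash);
	STORE_ONCE(h->hist_idx, i + 1);

	return count > HIST_LEN / 2;
}
```

With a zeroed history, a single flow is reported as huge from its ninth
packet on (8 of 15 slots match). One side effect of zero initialization
in this sketch is that a packet whose rxhash is 0 matches every empty
slot.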

>
> (And following patch has the atomic_long_inc() for stats)
>
>
>