netdev - Re: [net-next PATCH 2/3] net: fix enforcing of fragment queue hash list depth

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1366366287.3205.98.camel@edumazet-glaptop>
Date:	Fri, 19 Apr 2013 03:11:27 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Jesper Dangaard Brouer <brouer@...hat.com>
Cc:	"David S. Miller" <davem@...emloft.net>,
	Hannes Frederic Sowa <hannes@...essinduktion.org>,
	netdev@...r.kernel.org
Subject: Re: [net-next PATCH 2/3] net: fix enforcing of fragment queue hash
 list depth

On Thu, 2013-04-18 at 23:38 +0200, Jesper Dangaard Brouer wrote:
> I have found an issues with commit:
> 
>  commit 5a3da1fe9561828d0ca7eca664b16ec2b9bf0055
>  Author: Hannes Frederic Sowa <hannes@...essinduktion.org>
>  Date:   Fri Mar 15 11:32:30 2013 +0000
> 
>     inet: limit length of fragment queue hash table bucket lists
> 
> There is a connection between the fixed 128 hash depth limit and the
> frag mem limit/thresh settings, which limits how high the thresh can
> be set.
> 
> The 128 elems hash depth limit, results in bad behaviour if mem limit
> thresh holds are increased, via /proc/sys/net ::
> 
>  /proc/sys/net/ipv4/ipfrag_high_thresh
>  /proc/sys/net/ipv4/ipfrag_low_thresh
> 
> If we increase the thresh, to something allowing 128 elements in each
> bucket, which is not that high given the hash array size of 64
> (64*128=8192), e.g.
>   big MTU frags (2944(truesize)+208(ipq))*8192(max elems)=25755648
>   small frags   ( 896(truesize)+208(ipq))*8192(max elems)=9043968
> 
> The problem with commit 5a3da1fe (inet: limit length of fragment queue
> hash table bucket lists) is that, once we hit the limit, the we *keep*
> the existing frag queues, not allowing new frag queues to be created.
> Thus, an attacker can effectivly block handling of fragments for 30
> sec (as each frag queue have a timeout of 30 sec).
> 
> Even without increasing the limit, as Hannes showed, an attacker on
> IPv6 can "attack" a specific hash bucket, and via that change, can
> block/drop new fragments also (trying to) utilize this bucket.
> 
> Summary:
>  With the default mem limit/thresh settings, this is not general
> problem, but adjusting the thresh limits result in some-what
> unexpected behavior.
> 
> Proposed solution:
>  IMHO instead of keeping existing frag queues, we should kill one of
> the frag queues in the hash instead.


This strategy wont really help DDOS attacks. No frag will ever complete.

I am not sure its worth adding extra complexity.

> 
> Implementation complications:
>  Killing of frag queues while only holding the hash bucket lock, and
> not the frag queue lock, complicates the implementation, as we race
> and can end up (trying to) remove the hash element twice (resulting in
> an oops). This have been addressed by using hlist_del_init() and a
> hlist_unhashed() check in fq_unlink_hash().
> 
> Extra:
> * Added new sysctl "max_hash_depth" option, to allow users to adjust the hash
>   depth along with adjusting the thresh limits.
> * Change max hash depth to 32, thus limit handling to approx 2048 frag queues.
> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@...hat.com>
> ---
> 
>  include/net/inet_frag.h                 |    9 +---
>  net/ipv4/inet_fragment.c                |   64 ++++++++++++++++++++-----------
>  net/ipv4/ip_fragment.c                  |   13 +++++-
>  net/ipv6/netfilter/nf_conntrack_reasm.c |    5 +-
>  net/ipv6/reassembly.c                   |   15 ++++++-
>  5 files changed, 68 insertions(+), 38 deletions(-)

Hmm... adding a new sysctl without documentation is a clear sign you'll
be the only user of it.

You are also setting a default limit of 32, more likely to hit the
problem than current 128 value.

We know the real solution is to have a correctly sized hash table, so
why adding a temporary sysctl ?

As soon as /proc/sys/net/ipv4/ipfrag_high_thresh is changed, a resize
should be attempted.

But the max depth itself should be a reasonable value, and doesn't need
to be tuned IMHO.

The 64 slots hash table was chosen years ago, when machines had 3 order
of magnitude less ram than today.

Before hash resizing, I would just bump hash size to something more
reasonable like 1024.

That would allow some admin to set /proc/sys/net/ipv4/ipfrag_high_thresh
to a quite large value.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html