netdev - Re: [PATCH] ipv4: remove ip_rt

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 6 May 2010 15:54:43 -0400
From:	Neil Horman <nhorman@...driver.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	netdev@...r.kernel.org, davem@...emloft.net, kuznet@....inr.ac.ru,
	jmorris@...ei.org, yoshfuji@...ux-ipv6.org, kaber@...sh.net
Subject: Re: [PATCH] ipv4: remove ip_rt_secret timer

On Thu, May 06, 2010 at 08:33:33PM +0200, Eric Dumazet wrote:
> Le jeudi 06 mai 2010 à 14:02 -0400, Neil Horman a écrit :
> > On Thu, May 06, 2010 at 07:32:35PM +0200, Eric Dumazet wrote:
> > > Le jeudi 06 mai 2010 à 13:16 -0400, Neil Horman a écrit :
> > > > A while back there was a discussion regarding the rt_secret_interval timer.
> > > > Given that we've had the ability to do emergency route cache rebuilds for awhile
> > > > now, based on a statistical analysis of the various hash chain lengths in the
> > > > cache, the use of the flush timer is somewhat redundant.  This patch removes the
> > > > rt_secret_interval sysctl, allowing us to rely solely on the statistical
> > > > analysis mechanism to determine the need for route cache flushes.
> > > > 
> > > > Signed-off-by: Neil Horman <nhorman@...driver.com>
> > > > 
> > > > 
> > > 
> > > Nice cleanup try Neil, but this gives to attackers more time to hit the
> > > cache (infinite time should be enough as a matter of fact ;) )
> > > 
> > Not sure I follow what your complaint is.  I get that this gives attackers
> > plenty of time to try to attack the cache, but thats rather the point of the
> > statistics gathering for the cache, and why I don't think we need the secret
> > timer any more.   With the statistical analysis we do on the route cache every
> > gc cycle, we can tell if an attacker has guessed our rt_genid value, and is
> > making any chains in the cache abnormally long.  Thats when we do the rebuild,
> > modifying the rt_genid, forcing the attacker to re-discover it (which should be
> > difficult).  Theres no need to change this periodically if you're not being
> > attacked.
> >  
> > > Hints : 
> > > 
> > > - What is the initial value of rt_genid ?
> > > 
> > > - How/When is it changed (full 32 bits are changed or small
> > > perturbations ? check rt_cache_invalidate())
> > > 
> > /*
> >  * Pertubation of rt_genid by a small quantity [1..256]
> >  * Using 8 bits of shuffling ensure we can call rt_cache_invalidate()
> >  * many times (2^24) without giving recent rt_genid.
> >  * Jenkins hash is strong enough that litle changes of rt_genid are OK.
> >  */
> > static void rt_cache_invalidate(struct net *net)
> > {
> >         unsigned char shuffle;
> > 
> >         get_random_bytes(&shuffle, sizeof(shuffle));
> >         atomic_add(shuffle + 1U, &net->ipv4.rt_genid);
> > }
> > 
> > Clearly, its small changes.  To paraphrase the comment, Changes to rt_genid are
> > small enough to be confident that we don't repetatively use a gen_id often, but
> > sufficiently random that attackers cannot easily guess the next gen_id based on
> > the current value.  Both the timer and the statistics code use this invalidation
> > technique previously, and the latter continues to do so.
> > 
> > I've not changed anything regarding how we
> > invalidate, only when we choose to invalidate.  Invalidation can lead to
> > slowdowns during routing, since it send frames through the slow path after an
> > invalidation.  It behooves us to avoid preforming this invalidation without
> > need, and since we have a mechanism in place to do that invalidation specfically
> > when we need to, lets get rid of the code that handles that, and make it a bit
> > cleaner.  If there are users that feel strongly that they need to defend against
> > potential attacks by periodically changing their rt_genid, its still possible.
> > Its as simple as putting:
> > echo -1 > /proc/sys/net/ipv4/route/flush
> > in a cron job.
> > 
> 
> I have some customers that will simply kill me if their routing cache is
> disabled by a smart attack, slowing down their server by a 4x factor.
> 
> I know its possible, it has been done.
> 
Ok, I can understand that, but I don't think a single user profile needs to
dictate policy here.  the statistics code has a configurable threshold for where
to disable caching, so I would think you can just set that to be high.  And if
you need even more than that, you can do as I suggested, adding a:
echo -1 > /proc/sys/net/ipv4/route/flush
to a cron job on a short interval.  That will preform the _exact_ same operation
that the in-kernel timer did previously.  And the other 99% of our users don't
have to suffer a periodic cache invalidation when they don't need it.

> For a quiet machine possible rt_genid values range are known from
> attacker, and hash size is also known. Thats really too easy for the bad
> guys...
> 
Well, ok, I can buy your argument, but I think the problem you are describing is
orthogonoal to what my change here does.  If its too easy for attackers to guess
our genid secret, then we need to make it more complex to guess, not force
everyone to change it more frequently.

> Neil, I think your cleanup should stay a cleanup for the moment, or you
> must make sure rt_genid initial value is not 0 (read your patch
> again...)
> 
Yeah, I get that we start the gen_id at 0, that doesn't come from this change,
its always that way in the net_alloc initalization.  I certainly don't have a
problem during inialization starting with a randomization of that value.  I'll
post an updated patch shortly.

> I agree we dont need anymore the complex timer logic. We could keep the
> secret_interval (default to 0 if you really want) and force a
> rt_cache_invalidate() call once in a while from the periodic
> rt_check_expire() for example.
> 
Doing that doesn't solve my aim however, which is to avoid performing rt_genid
updates when no one is attacking you at all.  I completely agree that we can
start the gen_id at some random value (by forcing an initial invalidation),
however.  Beyond that however, if someone is managing to guess our secret value,
then we need to make our secret value more complex to determine.  Perhaps given
the reduction in the number of times we need to iterate our gen_id with the
timer gone, we can use something more heavyweight to determine the the hash
secret (the cprng perhaps?).

Regards
Neil
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html