netdev - Re: IP fragmentation broken in 3.6-rc ?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1345582019.5158.547.camel@edumazet-glaptop>
Date:	Tue, 21 Aug 2012 22:46:59 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Julian Anastasov <ja@....bg>
Cc:	Sylvain Munaut <s.munaut@...tever-company.com>,
	netdev@...r.kernel.org
Subject: Re: IP fragmentation broken in 3.6-rc ?

On Tue, 2012-08-21 at 23:00 +0300, Julian Anastasov wrote:
> 	Hello,
> 
> On Tue, 21 Aug 2012, Eric Dumazet wrote:
> 
> > In fact we re-enter ip_rt_update_pmtu() if we receive a second ICMP,
> > and rt->rt_pmtu is already set, but dst is expired.
> > 
> > Thats why Sylvain said it was not happening in the 10 minutes following
> > boot.
> > 
> > So calling again dst_set_expires(&rt->dst, ip_rt_mtu_expires) does
> > nothing : rt_pmtu is ignored because dst.expires is too old.
> 
> 	Oh, well. I thought dst_set_expires was used
> before for IPv4, it was not. I didn't expected function
> that was old to cause problem. So, you are right that
> MTU is not updated second time. But we must check if
> such change will harm IPv6, dst_set_expires was/is
> used there.
> 
> > Maybe we should just do :
> > 
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index e4ba974..d0181e2 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -952,7 +952,7 @@ static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
> >  	ip_rt_build_flow_key(&fl4, sk, skb);
> >  	mtu = __ip_rt_update_pmtu(rt, &fl4, mtu);
> >  
> > -	if (!rt->rt_pmtu) {
> > +	if (!rt->rt_pmtu || time_after_eq(jiffies, rt->dst.expires)) {
> 
> 	At first look it should not be needed to cause route
> relookup here. tcp_v4_mtu_reduced is prepared to use the new
> dst_mtu for mss after calling inet_csk_update_pmtu. It seems the
> new code expects that every socket that performs PMTU Discovery
> will get ICMP error, so tcp_v4_mtu_reduced should be called
> for every socket with every new ICMP. If there are 100 sockets
> we will get 100 ICMPs. If we set DST_OBSOLETE_KILL as you
> propose, all sockets will notice the need to relookup
> route and will learn the new PMTU from fnhe exception with less
> chances for other ICMP events. At first look, this variant looks
> better to me in case if we want single PMTU to be propagated
> to all sockets immediately. Can it cause other problems?
> Now the rt life with PMTU will be limited to 600 secs.
> 
> 	OTOH, tcp_current_mss() is called often, it will
> notice the new dst_mtu, so may be there is no need for
> route relookup by setting DST_OBSOLETE_KILL?
> 
> 	Not sure if we still have to support this second
> option, to change dst_set_expires to check for timeout=0:
> 
> 	if (dst->expires == 0 || time_after(expires, dst->expires) ||
> 	    !timeout)
> 
> 	In IPv4, ip_rt_update_pmtu was the only place that
> can extend dst->expires, for the rt_bind_exception case
> I assume expires is still 0 before the checks. Not sure if
> IPv6 needs to use time_after, may be it prefers the
> time_before variant? IPv6 updates MTU in rt6_update_expires
> and we call dst_set_expires where the timer will not be
> extended. But may be the intention is that MTU only
> can reduce the expiration, not to extend it after timer
> was set when route was added.
> 
> 	So, another option is to create new function
> dst_update_expires and to use it just for IPv4, it will
> use time_after because IPv4 uses the timer just for
> PMTU while IPv6 uses it in ip6_route_add() to limit
> route lifetime due to lft values.
> 
> 	dst_update_expires will be like dst_set_expires
> but with this difference (time_after):
> 
> 	if (dst->expires == 0 || time_after(expires, dst->expires) ||
> 	    !timeout)

Hmm, all these tests are not really needed, what about :

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index e4ba974..8d6d320 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -956,7 +956,7 @@ static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
 		dst->obsolete = DST_OBSOLETE_KILL;
 	} else {
 		rt->rt_pmtu = mtu;
-		dst_set_expires(&rt->dst, ip_rt_mtu_expires);
+		rt->dst.expires = max(1UL, jiffies + ip_rt_mtu_expires);
 	}
 }
 


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html