Message-ID: <4EF030D5.6040603@iki.fi>
Date:	Tue, 20 Dec 2011 08:53:09 +0200
From:	Timo Teräs <timo.teras@....fi>
To:	David Miller <davem@...emloft.net>
CC:	steffen.klassert@...unet.com, netdev@...r.kernel.org
Subject: Re: linux-3.0.x regression with ipv4 routes having mtu
On 12/19/2011 11:10 PM, David Miller wrote:
> From: Steffen Klassert <steffen.klassert@...unet.com>
> Date: Fri, 16 Dec 2011 13:21:47 +0100
> 
>> Subject: [PATCH] route: Initialize with the fib_metrics in the non default case
>>
>> We initialize the routing metrics with the cached values in
>> rt_init_metrics(). So if we have the metrics cached on the
>> inetpeer, we ignore the user configured fib_metrics. So
>> initialize the routing metrics with the fib_metrics if they
>> are different from dst_default_metrics.
>>
>> Signed-off-by: Steffen Klassert <steffen.klassert@...unet.com>
> 
> The current behavior is intentional.
> 
> Learned metrics should be used on all routes for which a inetpeer
> peer exists and the destination matches.
> 
> There is no sane way to allow overrides.
> 
> I'm pretty sure all of Timo's bugs will be fixed when you add the
> generation count for PMTU stuff.
I tried to look at the code to see how the fib MTU is handled, but I
don't think a generation count for PMTU alone would solve it.

My problem is that after the inetpeer is created, the fib MTU is never
looked at again. The code that copies it is in rt_init_metrics():
                if (inet_metrics_new(peer))
                        memcpy(peer->metrics, fi->fib_metrics,
                               sizeof(u32) * RTAX_MAX);
Since the inetpeer there never gets recycled (peer lookup does not look
at generation count), the metrics are initialised from the fib exactly
once: when the inetpeer is initially created.
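
To illustrate the point, here is a minimal userspace sketch of that
copy-once behaviour. It is not kernel code: the model_* names, the
shortened RTAX_MAX and the explicit metrics_new flag are simplified
stand-ins I made up for this example, assuming the "new" state is
cleared once the fib metrics have been copied.

```c
#include <string.h>

#define MODEL_RTAX_MAX 4   /* hypothetical, shortened for the sketch */
#define MODEL_MTU_SLOT 0   /* hypothetical index of the MTU metric */

struct model_peer {
	unsigned int metrics[MODEL_RTAX_MAX];
	int metrics_new;   /* stand-in for inet_metrics_new(peer) */
};

struct model_fib_info {
	unsigned int fib_metrics[MODEL_RTAX_MAX];
};

/* Mirrors the quoted snippet: copy only while the peer is still new. */
static void model_init_metrics(struct model_peer *peer,
			       const struct model_fib_info *fi)
{
	if (peer->metrics_new) {
		memcpy(peer->metrics, fi->fib_metrics,
		       sizeof(unsigned int) * MODEL_RTAX_MAX);
		peer->metrics_new = 0;
	}
}

/* First route lookup sees mtu 1500; the route is later changed to
 * mtu 1400 and looked up again. Returns the MTU the peer ends up with. */
static unsigned int model_demo_stale_mtu(void)
{
	struct model_peer peer = { .metrics = { 0 }, .metrics_new = 1 };
	struct model_fib_info fi = { .fib_metrics = { 1500, 0, 0, 0 } };

	model_init_metrics(&peer, &fi);        /* peer created, copies 1500 */
	fi.fib_metrics[MODEL_MTU_SLOT] = 1400; /* admin lowers the route MTU */
	model_init_metrics(&peer, &fi);        /* no copy, peer is not new */

	return peer.metrics[MODEL_MTU_SLOT];
}
```

In this model, model_demo_stale_mtu() keeps returning the original 1500
even though the route now says 1400, which is the behaviour I am seeing.
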
Now, if I have a running system with traffic to a specific inetpeer, and
I later add a system-wide override route with an mtu to that
destination, the updated MTU is never honoured, because it comes from
the fib and not via the PMTU mechanism.

Or maybe I missed the place where that update would happen?

It seems that the inetpeer.c comment that "The (inetpeer) nodes
contains long-living information about the peer which doesn't depend on
routes." no longer holds true, since the MTU is (or at least used to
be) a route-dependent value.

Perhaps we could then at least check the fib MTU and update the inetpeer
if it's lower than the inetpeer's current value. Of course, this means
that if there are several routes to the same destination (e.g. due to
policy routing) with different MTUs, only the smallest one would get
used system-wide. But at least the route-specific MTU would work then.
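
Roughly, the check I have in mind could look like the following
userspace sketch. Again this is only a model, not a patch: the function
name is hypothetical, and treating 0 as "no MTU set" is an assumption
made for the example.

```c
/* Lower the cached peer MTU if the fib route carries a smaller one.
 * Assumption for this sketch: 0 means "no MTU configured". */
static void model_update_peer_mtu(unsigned int *peer_mtu,
				  unsigned int fib_mtu)
{
	if (fib_mtu == 0)
		return;                 /* route has no MTU override */
	if (*peer_mtu == 0 || fib_mtu < *peer_mtu)
		*peer_mtu = fib_mtu;    /* only ever move downwards */
}

/* Three routes to the same destination (e.g. via policy routing) with
 * MTUs 1400, 1300 and 1450 are looked up in turn. Returns the MTU the
 * peer ends up with. */
static unsigned int model_demo_min_mtu(void)
{
	unsigned int peer_mtu = 1500;

	model_update_peer_mtu(&peer_mtu, 1400);
	model_update_peer_mtu(&peer_mtu, 1300);
	model_update_peer_mtu(&peer_mtu, 1450); /* larger, so ignored */

	return peer_mtu;
}
```

The demo ends up at 1300, which shows the drawback mentioned above: the
smallest MTU among all routes to the destination wins for everyone.
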

This is a real problem for me, as I have userland code that dynamically
adds per-destination mtu routes to work around black hole ISP routers.

- Timo