[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f87d7615-a978-4f9e-f124-635789fa5874@novek.ru>
Date: Thu, 13 Oct 2022 01:43:17 +0100
From: Vadim Fedorenko <vfedorenko@...ek.ru>
To: Vyacheslav Salnikov <snordicstr16@...il.com>,
netfilter-devel@...r.kernel.org, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: bridge:fragmented packets dropped by bridge
On 07.10.2022 12:21, Vyacheslav Salnikov wrote:
> Hi.
>
> I switched from kernel versions 4.9 to 5.15 and found that the MTU on
> the interfaces in the bridge does not change.
> For example:
> I have the following bridge:
> bridge interface
> br0 sw1
> sw2
> sw3
>
> And I change with ifconfig MTU.
> I see that br0 sw1..sw3 has changed MTU from 1500 -> 1982.
>
> But if i send a ping through these interfaces, I get 1500(I added
> prints for output)
> I investigated the code and found the reason:
> The following commit came in the new kernel:
> https://github.com/torvalds/linux/commit/ac6627a28dbfb5d96736544a00c3938fa7ea6dfb
>
> And the behavior of the MTU setting has changed:
>>
>> Kernel 4.9:
>> if (net->ipv4.sysctl_ip_fwd_use_pmtu ||
>> ip_mtu_locked(dst) ||
>> !forwarding) <--- True
>> return dst_mtu(dst) <--- 1982
>>
>>
>> / 'forwarding = true' case should always honour route mtu /
>> mtu = dst_metric_raw(dst, RTAX_MTU);
>> if (mtu)
>> return mtu;
>
>
>
> Kernel 5.15:
>>
>> if (READ_ONCE(net->ipv4.sysctl_ip_fwd_use_pmtu) ||
>> ip_mtu_locked(dst) ||
>> !forwarding) { <--- True
>> mtu = rt->rt_pmtu; <--- 0
>> if (mtu && time_before(jiffies, rt->dst.expires)) <-- False
>> goto out;
>> }
>>
>> / 'forwarding = true' case should always honour route mtu /
>> mtu = dst_metric_raw(dst, RTAX_MTU); <---- 1500
>> if (mtu) <--- True
>> goto out;
>
> As I see from the code in the end takes mtu from br_dst_default_metrics
>> static const u32 br_dst_default_metrics[RTAX_MAX] = {
>> [RTAX_MTU - 1] = 1500,
>> };
>
> Why is rt_pmtu now used instead of dst_mtu?
> Why is forwarding = False called with dst_metric_raw?
> Maybe we should add processing when mtu = rt->rt_pmtu == 0?
> Could this be an error?
>
If you compare ipv4_mtu code from 4.9 you will see that the very first mtu value
is filled by rt->rt_pmtu value. I believe there were changes to the bridge code
where rt_pmtu value got empty or cleared.
I'm still looking for the root cause of the problem, will update you once I find it.
>
> I found a thread discussing a similar problem. It suggested porting
> the next patch:
> Signed-off-by: Rundong Ge <rdong.ge@...il.com>
> ---
> include/net/ip.h | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/include/net/ip.h b/include/net/ip.h
> index 29d89de..0512de3 100644
> --- a/include/net/ip.h
> +++ b/include/net/ip.h
> @@ -450,6 +450,8 @@ static inline unsigned int
> ip_dst_mtu_maybe_forward(const struct dst_entry *dst,
> static inline unsigned int ip_skb_dst_mtu(struct sock *sk,
> const struct sk_buff *skb)
> {
> + if ((skb_dst(skb)->flags & DST_FAKE_RTABLE) && skb->dev)
> + return min(skb->dev->mtu, IP_MAX_MTU);
> if (!sk || !sk_fullsock(sk) || ip_sk_use_pmtu(sk)) {
> bool forwarding = IPCB(skb)->flags & IPSKB_FORWARDED;
>
>
> Why was this patch not accepted in the end?
Powered by blists - more mailing lists