Message-ID:
 <PH0PR10MB4504888284FF4CBA648197D0ACB82@PH0PR10MB4504.namprd10.prod.outlook.com>
Date: Mon, 21 Apr 2025 01:10:40 +0000
From: Venkat Venkatsubra <venkat.x.venkatsubra@...cle.com>
To: "davem@...emloft.net" <davem@...emloft.net>,
        "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        "kuba@...nel.org" <kuba@...nel.org>
CC: "razor@...ckwall.org" <razor@...ckwall.org>
Subject: IP de-fragmentation failing on bridge

A brief problem description.

A ping from a VM interface with mtu 9000 fails:

# ping -c 1 -s 9100 192.168.16.124
PING 192.168.16.124 (192.168.16.124) 9100(9128) bytes of data.
1 packet transmitted, 0 received, 100% packet loss

On the host, the echo request arrives as 2 fragments:

frag1 iplen 8996
frag2 iplen  152
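
For reference, the two fragment sizes follow from the sender's mtu. The sketch below is a user-space calculation only, assuming a 20-byte IPv4 header without options and that fragmentation is actually needed; the helper names frag_payload_max and frag_iplen are made up for illustration and are not kernel functions.

```c
#include <assert.h>

#define IPV4_HDR_LEN 20u	/* assumption: no IP options */

/* Largest payload a non-final fragment can carry: the room left after the
 * IP header, rounded down to a multiple of 8, because fragment offsets are
 * expressed in 8-byte units. */
static unsigned int frag_payload_max(unsigned int mtu)
{
	assert(mtu > IPV4_HDR_LEN);
	return ((mtu - IPV4_HDR_LEN) / 8u) * 8u;
}

/* Total length (IP header + payload) of fragment number idx (0-based) for
 * l4_len bytes carried above IP. Returns 0 when idx is past the last
 * fragment. */
static unsigned int frag_iplen(unsigned int l4_len, unsigned int mtu,
			       unsigned int idx)
{
	unsigned int per_frag = frag_payload_max(mtu);
	unsigned int offset = idx * per_frag;
	unsigned int remaining;

	if (offset >= l4_len)
		return 0;
	remaining = l4_len - offset;
	return IPV4_HDR_LEN + (remaining < per_frag ? remaining : per_frag);
}
```

With l4_len = 9108 (9100 bytes of data plus the 8-byte ICMP header) and mtu = 9000, frag_iplen() gives 8996 for fragment 0 and 152 for fragment 1, matching the sizes seen on the host.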

They are passed to the bridge.
bridge-nf-call-iptables is enabled.

# cat /proc/sys/net/bridge/bridge-nf-call-iptables
1

The bridge's mtu is 9000.

# ip link show dev privnet
11: privnet: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000

It needs to be forwarded over icbond0 which also has mtu 9000.

# ip link show dev icbond0
10: icbond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000

At the time of defragmentation, the bridge drops the reassembled IP packet because
frag_max_size (8996) exceeds the "mtu", which it thinks is 1500.

Prior to
ac6627a2 net: ipv4: Consolidate ipv4_mtu and ip_dst_mtu_maybe_forward

the bridge was getting the "mtu" from its "fake_mtu" dst_ops.
That returns the interface mtu (9000 in this case), which was good for us.
After that patch, it depends on the dst metric for RTAX_MTU instead.

        /* 'forwarding = true' case should always honour route mtu */
        mtu = dst_metric_raw(dst, RTAX_MTU);

From dst_metric_raw we get 1500 which is the default set by the bridge.

static const u32 br_dst_default_metrics[RTAX_MAX] = {
        [RTAX_MTU - 1] = 1500,
};

Since the bridge sets the metrics as read_only, this metric doesn't seem to reflect the true mtu,
which is larger (9000) in our case.

Is this already a resolved issue?
I couldn't find a matching fix in the latest bridge code.

If we want to retain the pre-ac6627a2 behavior,
would keeping 0 as the "fake" RTAX_MTU be a viable option?
i.e.
       [RTAX_MTU - 1] = 0,
instead of
       [RTAX_MTU - 1] = 1500,
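
To make the difference between the two values concrete, here is a rough user-space model of the decision. It is a sketch under assumptions, not the kernel code: it assumes that a non-zero RTAX_MTU metric is used as-is, that a zero metric makes the lookup fall back to the device mtu, and that the drop happens when frag_max_size is larger than the resulting mtu. struct fake_dst, model_fwd_mtu and model_would_drop are hypothetical names used only here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the values involved; not a kernel type. */
struct fake_dst {
	unsigned int metric_mtu;	/* models dst_metric_raw(dst, RTAX_MTU) */
	unsigned int dev_mtu;		/* models the bridge device's mtu */
};

/* Assumed selection after ac6627a2: a non-zero route metric wins,
 * otherwise the device mtu is used. */
static unsigned int model_fwd_mtu(const struct fake_dst *dst)
{
	assert(dst);
	if (dst->metric_mtu)
		return dst->metric_mtu;
	return dst->dev_mtu;
}

/* Models the drop described above: a reassembled packet is dropped when
 * its largest original fragment exceeds the mtu. A frag_max_size of 0
 * stands for a packet that was never reassembled. */
static bool model_would_drop(const struct fake_dst *dst,
			     unsigned int frag_max_size)
{
	return frag_max_size && frag_max_size > model_fwd_mtu(dst);
}
```

Under these assumptions, { .metric_mtu = 1500, .dev_mtu = 9000 } drops a packet with frag_max_size 8996, while { .metric_mtu = 0, .dev_mtu = 9000 } lets it through. Whether a zero metric really reaches the device mtu fallback in the current tree would need to be confirmed against the code.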

Thanks for your help.

Thanks,
Venkat
