[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210902174845.GE3350910@belle.intranet.vanheusden.com>
Date: Thu, 2 Sep 2021 19:48:45 +0200
From: folkert <folkert@...heusden.com>
To: Florian Westphal <fw@...len.de>
Cc: netdev@...r.kernel.org
Subject: Re: masquerading AFTER first packet
> > I'm seeing something strange. I'm doing an snmpwalk on an snmp server of
> > mine (behing DNAT) , and after the first response it goes into a timeout.
> > I did a tcpdump and saw this:
> >
> > 1630528031.843264 IP 185.243.112.54.38377 > 37.34.63.177.161: GetNextRequest(23) .1.3.6.1
> > 1630528031.843924 IP 37.34.63.177.161 > 185.243.112.54.38377: GetResponse(34) .1.3.6.1.2=0 <-- ok
> > 1630528031.846950 IP 185.243.112.54.38377 > 37.34.63.177.161: GetNextRequest(24) .1.3.6.1.2
> > 1630528031.847415 IP 192.168.4.2.161 > 185.243.112.54.38377: GetResponse(35) .1.3.6.1.2.1=0 <-- fail
> > 1630528032.847649 IP 185.243.112.54.38377 > 37.34.63.177.161: GetNextRequest(24) .1.3.6.1.2
> > 1630528032.848081 IP 192.168.4.2.161 > 185.243.112.54.38377: GetResponse(35) .1.3.6.1.2.1=0 <-- fail
>
> Looks like a kernel bug, but you did not include kernel version.
I'm definately seeing this with 5.10.0-8-amd64 (debian bullseye on a
vps) and I may have seen it with 4.9.0-5 as well (same system).
> Please also show 'ethtool -k' for in and out interfaces.
Included as attachment.
> You might want to try and disable udp releated offload settings to see
> if that helps.
If offloading is the problem, wouldn't the first packet fail as well?
> e.g.
> c3df39ac9b0e3747bf8233ea9ce4ed5ceb3199d3,
> "udp: ipv4: manipulate network header of NATed UDP GRO fraglist"
> which fixed a bug where only first packet of GRO'd udp train would
> have nat applied, (was broken between 5.6 and 5.11)
Ok I tried "ethtool -K eth* gro off" but it made no difference.
> Other explanation is that conntrack thinks only first packet is valid,
> you can check this for example via
>
> sysctl net.netfilter.nf_conntrack_acct=1
>
> and then checking if 'conntrack -L' shows increasing packet/byte
> counters or is stuck at '1'.
root@...001:~# while true ; do conntrack -L 2>&1 | grep port=161 ; sleep 1 ; done
udp 17 3 src=185.243.112.54 dst=37.34.63.177 sport=40873 dport=161 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=40873 [ASSURED] mark=0 use=1
udp 17 2 src=185.243.112.54 dst=37.34.63.177 sport=40873 dport=161 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=40873 [ASSURED] mark=0 use=1
udp 17 1 src=185.243.112.54 dst=37.34.63.177 sport=40873 dport=161 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=40873 [ASSURED] mark=0 use=1
udp 17 29 src=185.243.112.54 dst=37.34.63.177 sport=52775 dport=161 packets=2 bytes=133 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=52775 packets=1 bytes=80 [ASSURED] mark=0 use=1
udp 17 0 src=185.243.112.54 dst=37.34.63.177 sport=40873 dport=161 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=40873 [ASSURED] mark=0 use=1
udp 17 29 src=185.243.112.54 dst=37.34.63.177 sport=52775 dport=161 packets=3 bytes=200 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=52775 packets=1 bytes=80 [ASSURED] mark=0 use=1
udp 17 119 src=185.243.112.54 dst=37.34.63.177 sport=52775 dport=161 packets=4 bytes=267 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=52775 packets=1 bytes=80 [ASSURED] mark=0 use=1
udp 17 119 src=185.243.112.54 dst=37.34.63.177 sport=52775 dport=161 packets=5 bytes=334 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=52775 packets=1 bytes=80 [ASSURED] mark=0 use=1
udp 17 119 src=185.243.112.54 dst=37.34.63.177 sport=52775 dport=161 packets=6 bytes=401 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=52775 packets=1 bytes=80 [ASSURED] mark=0 use=1
udp 17 119 src=185.243.112.54 dst=37.34.63.177 sport=52775 dport=161 packets=7 bytes=468 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=52775 packets=1 bytes=80 [ASSURED] mark=0 use=1
udp 17 118 src=185.243.112.54 dst=37.34.63.177 sport=52775 dport=161 packets=7 bytes=468 src=192.168.4.2 dst=185.243.112.54 sport=161 dport=52775 packets=1 bytes=80 [ASSURED] mark=0 use=1
...
So yes, increating packet/byte counters.
> If the remaining packets are indeed invalid, try setting
> sysctl net.netfilter.nf_conntrack_log_invalid=17
[Thu Sep 2 19:40:54 2021] DNAT: IN=eth0 OUT= MAC=52:54:00:65:21:38:0c:86:10:b5:91:e0:08:00 SRC=185.243.112.54 DST=37.34.63.177 LEN=66 TOS=0x00 PREC=0x00 TTL=55 ID=1501 DF PROTO=UDP SPT=56523 DPT=161 LEN=46
[Thu Sep 2 19:40:54 2021] FWD: IN=eth0 OUT=eth1 MAC=52:54:00:65:21:38:0c:86:10:b5:91:e0:08:00 SRC=185.243.112.54 DST=192.168.4.2 LEN=66 TOS=0x00 PREC=0x00 TTL=54 ID=1501 DF PROTO=UDP SPT=56523 DPT=161 LEN=46
[Thu Sep 2 19:40:54 2021] FWD: IN=eth0 OUT=eth1 MAC=52:54:00:65:21:38:0c:86:10:b5:91:e0:08:00 SRC=185.243.112.54 DST=192.168.4.2 LEN=67 TOS=0x00 PREC=0x00 TTL=54 ID=1502 DF PROTO=UDP SPT=56523 DPT=161 LEN=47
[Thu Sep 2 19:40:54 2021] nf_ct_proto_17: bad checksum IN= OUT= SRC=192.168.4.2 DST=185.243.112.54 LEN=82 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=161 DPT=56523 LEN=61
[Thu Sep 2 19:40:55 2021] FWD: IN=eth0 OUT=eth1 MAC=52:54:00:65:21:38:0c:86:10:b5:91:e0:08:00 SRC=185.243.112.54 DST=192.168.4.2 LEN=67 TOS=0x00 PREC=0x00 TTL=54 ID=1637 DF PROTO=UDP SPT=56523 DPT=161 LEN=47
[Thu Sep 2 19:40:55 2021] nf_ct_proto_17: bad checksum IN= OUT= SRC=192.168.4.2 DST=185.243.112.54 LEN=82 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=161 DPT=56523 LEN=61
[Thu Sep 2 19:40:56 2021] FWD: IN=eth0 OUT=eth1 MAC=52:54:00:65:21:38:0c:86:10:b5:91:e0:08:00 SRC=185.243.112.54 DST=192.168.4.2 LEN=67 TOS=0x00 PREC=0x00 TTL=54 ID=1700 DF PROTO=UDP SPT=56523 DPT=161 LEN=47
[Thu Sep 2 19:40:56 2021] nf_ct_proto_17: bad checksum IN= OUT= SRC=192.168.4.2 DST=185.243.112.54 LEN=82 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=161 DPT=56523 LEN=61
[Thu Sep 2 19:40:57 2021] FWD: IN=eth0 OUT=eth1 MAC=52:54:00:65:21:38:0c:86:10:b5:91:e0:08:00 SRC=185.243.112.54 DST=192.168.4.2 LEN=67 TOS=0x00 PREC=0x00 TTL=54 ID=1859 DF PROTO=UDP SPT=56523 DPT=161 LEN=47
[Thu Sep 2 19:40:57 2021] nf_ct_proto_17: bad checksum IN= OUT= SRC=192.168.4.2 DST=185.243.112.54 LEN=82 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=161 DPT=56523 LEN=61
[Thu Sep 2 19:40:58 2021] FWD: IN=eth0 OUT=eth1 MAC=52:54:00:65:21:38:0c:86:10:b5:91:e0:08:00 SRC=185.243.112.54 DST=192.168.4.2 LEN=67 TOS=0x00 PREC=0x00 TTL=54 ID=1946 DF PROTO=UDP SPT=56523 DPT=161 LEN=47
[Thu Sep 2 19:40:58 2021] nf_ct_proto_17: bad checksum IN= OUT= SRC=192.168.4.2 DST=185.243.112.54 LEN=82 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=161 DPT=56523 LEN=61
[Thu Sep 2 19:40:59 2021] FWD: IN=eth0 OUT=eth1 MAC=52:54:00:65:21:38:0c:86:10:b5:91:e0:08:00 SRC=185.243.112.54 DST=192.168.4.2 LEN=67 TOS=0x00 PREC=0x00 TTL=54 ID=2131 DF PROTO=UDP SPT=56523 DPT=161 LEN=47
[Thu Sep 2 19:40:59 2021] nf_ct_proto_17: bad checksum IN= OUT= SRC=192.168.4.2 DST=185.243.112.54 LEN=82 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=161 DPT=56523 LEN=61
Those 'bad checksum's are interesting:
root@...001:~# tcpdump -r sep02.pcap -vv | grep 'udp sum ok'
reading from file test.pcap, link-type EN10MB (Ethernet), snapshot length 1600
185.243.112.54.56523 > 192.168.4.2.snmp: [udp sum ok] { SNMPv1 { GetNextRequest(23) R=1216445187 .iso.org.dod.internet } }
192.168.4.2.snmp > 185.243.112.54.56523: [udp sum ok] { SNMPv1 { GetResponse(34) R=1216445187 .iso.org.dod.internet.mgmt=0 } }
185.243.112.54.56523 > 192.168.4.2.snmp: [udp sum ok] { SNMPv1 { GetNextRequest(24) R=1216445188 .iso.org.dod.internet.mgmt } }
192.168.4.2.snmp > 185.243.112.54.56523: [udp sum ok] { SNMPv1 { GetResponse(35) R=1216445188 =0 } }
185.243.112.54.56523 > 192.168.4.2.snmp: [udp sum ok] { SNMPv1 { GetNextRequest(24) R=1216445188 .iso.org.dod.internet.mgmt } }
192.168.4.2.snmp > 185.243.112.54.56523: [udp sum ok] { SNMPv1 { GetResponse(35) R=1216445188 =0 } }
185.243.112.54.56523 > 192.168.4.2.snmp: [udp sum ok] { SNMPv1 { GetNextRequest(24) R=1216445188 .iso.org.dod.internet.mgmt } }
192.168.4.2.snmp > 185.243.112.54.56523: [udp sum ok] { SNMPv1 { GetResponse(35) R=1216445188 =0 } }
185.243.112.54.56523 > 192.168.4.2.snmp: [udp sum ok] { SNMPv1 { GetNextRequest(24) R=1216445188 .iso.org.dod.internet.mgmt } }
192.168.4.2.snmp > 185.243.112.54.56523: [udp sum ok] { SNMPv1 { GetResponse(35) R=1216445188 =0 } }
185.243.112.54.56523 > 192.168.4.2.snmp: [udp sum ok] { SNMPv1 { GetNextRequest(24) R=1216445188 .iso.org.dod.internet.mgmt } }
192.168.4.2.snmp > 185.243.112.54.56523: [udp sum ok] { SNMPv1 { GetResponse(35) R=1216445188 =0 } }
185.243.112.54.56523 > 192.168.4.2.snmp: [udp sum ok] { SNMPv1 { GetNextRequest(24) R=1216445188 .iso.org.dod.internet.mgmt } }
192.168.4.2.snmp > 185.243.112.54.56523: [udp sum ok] { SNMPv1 { GetResponse(35) R=1216445188 =0 } }
As you see, tcpdump says they're fine.
View attachment "eth0.txt" of type "text/plain" (1786 bytes)
View attachment "eth1.txt" of type "text/plain" (1836 bytes)
Download attachment "sep02.pcap" of type "application/vnd.tcpdump.pcap" (1484 bytes)
Powered by blists - more mailing lists