Message-ID: <1320761153.3444.4.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
Date:	Tue, 08 Nov 2011 15:05:53 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Chris Siebenmann <cks@...toronto.edu>
Cc:	netdev@...r.kernel.org
Subject: Re: Bug? GRE tunnel periodically won't transmit some packets

On Tuesday, 08 November 2011 at 08:05 -0500, Chris Siebenmann wrote:
> | On Tuesday, 08 November 2011 at 02:08 -0500, Chris Siebenmann wrote:
> | >  Let me know if you want a full dump of 'ip link show' (with or
> | > without verbosity).
> | > 
> | > 	- cks
> | 
> | Oh yes, I meant "ip -s -s link show dev extun"
> 
> 7: extun: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1200 qdisc noqueue state UNKNOWN 
>     link/gre 66.96.18.208 peer 128.100.3.58
>     RX: bytes  packets  errors  dropped overrun mcast   
>     3465662    19823    0       0       0       0      
>     RX errors: length  crc     frame   fifo    missed
>                0        0       0       0       0      
>     TX: bytes  packets  errors  dropped carrier collsns 
>     2590152    30918    18      0       0       0      
>     TX errors: aborted fifo    window  heartbeat
>                0        0       0       0      
> 
> Then:
> | You might have packet drops at the ppp0 device itself
> | 
> | ip -s -s link show dev ppp0 ; tc -s -d qdisc show dev ppp0
> 
> ; ip -s -s link show dev ppp0 ; tc -s -d qdisc show dev ppp0
> 5: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1492 qdisc pfifo_fast state UNKNOWN qlen 3
>     link/ppp 
>     RX: bytes  packets  errors  dropped overrun mcast   
>     1065281803 2474545  0       0       0       0      
>     RX errors: length  crc     frame   fifo    missed
>                0        0       0       0       0      
>     TX: bytes  packets  errors  dropped carrier collsns 
>     1046103545 2887730  0       0       0       0      
>     TX errors: aborted fifo    window  heartbeat
>                0        0       0       0      
> qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 1046103557 bytes 2887727 pkt (dropped 20, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0 
> 
>  I have now seen 552-byte mtus listed in 'ip route show table cache'
> output. I can include the full output if you think it's of interest.
> One of the remote IPs is a host we run, so I was able to test ssh to it
> and it did not stall and did not seem to use 'length 500' packets in the
> SSH connection, so this may be a red herring.
> 
> Okay, I just rebooted the machine (into the Fedora 15 kernel) and had
> the problem reproduce. This time ppp0 shows no dropped packets while
> extun counts up errors and I have a 552 mtu route (actually several of
> them) in 'ip route show table cache':
> 
> 6: extun: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1200 qdisc noqueue state UNKNOWN 
>     link/gre 66.96.18.208 peer 128.100.3.58
>     RX: bytes  packets  errors  dropped overrun mcast   
>     119000     653      0       0       0       0      
>     RX errors: length  crc     frame   fifo    missed
>                0        0       0       0       0      
>     TX: bytes  packets  errors  dropped carrier collsns 
>     85848      989      17      0       0       0      
>     TX errors: aborted fifo    window  heartbeat
>                0        0       0       0      
> 
> 4: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1492 qdisc pfifo_fast state UNKNOWN qlen 3
>     link/ppp 
>     RX: bytes  packets  errors  dropped overrun mcast   
>     2152137    3743     0       0       0       0      
>     RX errors: length  crc     frame   fifo    missed
>                0        0       0       0       0      
>     TX: bytes  packets  errors  dropped carrier collsns 
>     360478     4022     0       0       0       0      
>     TX errors: aborted fifo    window  heartbeat
>                0        0       0       0      
> qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 360438 bytes 4018 pkt (dropped 0, overlimits 0 requeues 0) 
>  backlog 0b 0p requeues 0 
> 
> Routes with low mtus for either the GRE gateway target or the host I am
> trying to ssh to:
> 
> 128.100.3.51 dev extun  src 128.100.3.52 
>     cache  expires 401sec ipid 0x941a mtu 552 rtt 27ms rttvar 27ms cwnd 10
> --
> 128.100.3.58 from 128.100.3.52 dev extun 
>     cache  expires 397sec ipid 0xdeda mtu 614 rtt 15ms rttvar 15ms cwnd 10
> --
> 128.100.3.58 from 66.96.18.208 dev ppp0 
>     cache  expires 397sec ipid 0xdeda mtu 614 rtt 15ms rttvar 15ms cwnd 10
> --
> 128.100.3.58 from 66.96.18.208 dev ppp0 
>     cache  expires 397sec ipid 0xdeda mtu 614 rtt 15ms rttvar 15ms cwnd 10
> --
> 128.100.3.51 from 128.100.3.52 dev extun 
>     cache  expires 401sec ipid 0x941a mtu 552 rtt 27ms rttvar 27ms cwnd 10
> 128.100.3.58 dev extun  src 128.100.3.52 
>     cache  expires 397sec ipid 0xdeda mtu 614 rtt 15ms rttvar 15ms cwnd 10
> --
> 128.100.3.51 from 128.100.3.52 tos throughput dev extun 
>     cache  expires 401sec ipid 0x941a mtu 552 rtt 27ms rttvar 27ms cwnd 10
> 128.100.3.51 from 128.100.3.52 dev extun 
>     cache  expires 401sec ipid 0x941a mtu 552 rtt 27ms rttvar 27ms cwnd 10
> 128.100.3.51 from 128.100.3.52 tos lowdelay dev extun 
>     cache  expires 401sec ipid 0x941a mtu 552 rtt 27ms rttvar 27ms cwnd 10
> 
>  With the problem not happening any more, 'ip route' shows only a few
> mtu 552 routes, none to the machines I am doing test ssh's to. The final
> error count was 46 errors on extun (with 0 for all of the specific
> causes) and no errors or dropped packets on ppp0.
> 

So it appears the drops happen in the GRE xmit path, because the frame
is bigger than the MTU...
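
To illustrate the arithmetic (a rough sketch only, not the kernel's actual
code): assuming a plain GRE-over-IPv4 tunnel with a 20-byte outer IP header
and a 4-byte GRE header (no key, checksum, or sequence number), a cached
path MTU of 614 toward the tunnel peer leaves room for only 590 bytes of
inner packet, well under the 1200-byte extun MTU. The function names below
are made up for this example.

```python
# Assumed overheads for a plain GRE-over-IPv4 tunnel
# (no GRE key, checksum, or sequence number options).
OUTER_IPV4_HEADER = 20
GRE_BASE_HEADER = 4


def max_inner_packet(path_mtu: int) -> int:
    """Largest inner IP packet that still fits after GRE encapsulation."""
    return path_mtu - OUTER_IPV4_HEADER - GRE_BASE_HEADER


def fits(inner_len: int, path_mtu: int) -> bool:
    """True if an inner packet of inner_len bytes fits within path_mtu."""
    return inner_len <= max_inner_packet(path_mtu)


if __name__ == "__main__":
    # 1492 is the ppp0 MTU; 614 is the cached route MTU toward the peer.
    for path_mtu in (1492, 614):
        print(path_mtu, max_inner_packet(path_mtu), fits(1200, path_mtu))
```

With the ppp0 MTU of 1492 a full 1200-byte tunnel packet fits; with the
cached 614 it does not, and if it cannot be fragmented it would be dropped.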

Maybe you are receiving some strange ICMP (ICMP_FRAG_NEEDED) messages
from a buggy host?

You could try to catch them with "tcpdump -s 1000 -i any icmp" maybe...
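
If you save the raw ICMP payloads from such a capture, something like the
sketch below could pull out the MTU that the sender is advertising. It
assumes the RFC 1191 layout (type 3, code 4, next-hop MTU in bytes 6-7 of
the ICMP header) and that you hand it the ICMP message without the IP
header in front; parse_frag_needed is a made-up helper name, not part of
any existing tool.

```python
import struct

ICMP_DEST_UNREACH = 3   # ICMP type: destination unreachable
ICMP_FRAG_NEEDED = 4    # ICMP code: fragmentation needed and DF set


def parse_frag_needed(icmp: bytes):
    """Return the advertised next-hop MTU, or None if not a frag-needed message.

    Expects the ICMP message only (the IP header already stripped).
    """
    if len(icmp) < 8:
        return None
    # type (1), code (1), checksum (2), unused (2), next-hop MTU (2)
    icmp_type, code, _checksum, _unused, next_hop_mtu = struct.unpack(
        "!BBHHH", icmp[:8]
    )
    if icmp_type != ICMP_DEST_UNREACH or code != ICMP_FRAG_NEEDED:
        return None
    return next_hop_mtu


if __name__ == "__main__":
    # Hand-built example header advertising an MTU of 552 (checksum left at 0).
    sample = struct.pack("!BBHHH", 3, 4, 0, 0, 552)
    print(parse_frag_needed(sample))
```

An advertised MTU matching the 552 or 614 values in the route cache would
point at the host sending those messages.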


