Message-ID: <EB8510AA7A943D43916A72C9B8F4181F62A096BF@cvk038.intra.cvk.de>
Date:   Wed, 16 Oct 2019 12:57:13 +0000
From:   "Bartschies, Thomas" <Thomas.Bartschies@....de>
To:     'Eric Dumazet' <eric.dumazet@...il.com>,
        'David Ahern' <dsahern@...il.com>,
        "'netdev@...r.kernel.org'" <netdev@...r.kernel.org>
Subject: Re: big ICMP requests get disrupted on IPSec tunnel activation

Hello,

I did another test, this time changing the order: I first triggered the IPSec policy and then tried to ping in parallel with a big packet size.
I could still reproduce the issue, but the trace was completely different. Maybe this time I've captured the trace for the problematic connection?

[Mi Okt 16 14:43:15 2019] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.2.18-200.fc30.x86_64 #1
[Mi Okt 16 14:43:15 2019] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[Mi Okt 16 14:43:15 2019] RIP: 0010:fq_enqueue+0x531/0x610 [sch_fq]
[Mi Okt 16 14:43:15 2019] Code: c2 48 8b 3d 49 28 00 00 48 89 ee e8 39 0a ec c1 45 85 ed 75 d4 49 8b 2f e9 f1 fb ff ff 48 83 83 30 02 00 00 01 e9 fd fc ff ff <0f> 0b e9 83 fc ff ff 4c 89 ea 31 c0 e9 be fc ff ff 41 8b 46 08 39
[Mi Okt 16 14:43:15 2019] RSP: 0018:ffff948640748ae8 EFLAGS: 00010206
[Mi Okt 16 14:43:15 2019] RAX: 0000000df8475800 RBX: ffff895036e25c00 RCX: 0000000000000018
[Mi Okt 16 14:43:15 2019] RDX: 00000e7ebca89351 RSI: 0014ca96f0789efd RDI: ffffffff83423a60
[Mi Okt 16 14:43:15 2019] RBP: 15ce211ccec6bca9 R08: 000000002a91ed5a R09: 0000000000000001
[Mi Okt 16 14:43:15 2019] R10: ffff948640748a80 R11: 00000000bc3ed800 R12: ffff895034e00900
[Mi Okt 16 14:43:15 2019] R13: ffff89503720f2a0 R14: 00000000000004d3 R15: ffff895038809748
[Mi Okt 16 14:43:15 2019] FS:  0000000000000000(0000) GS:ffff89503bb80000(0000) knlGS:0000000000000000
[Mi Okt 16 14:43:15 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Mi Okt 16 14:43:15 2019] CR2: 00007f3235d5d000 CR3: 00000001334a6003 CR4: 00000000000606e0
[Mi Okt 16 14:43:15 2019] Call Trace:
[Mi Okt 16 14:43:15 2019]  <IRQ>
[Mi Okt 16 14:43:15 2019]  ? __qdisc_run+0x151/0x540
[Mi Okt 16 14:43:15 2019]  __dev_queue_xmit+0x45b/0x960
[Mi Okt 16 14:43:15 2019]  ? mod_timer+0x15b/0x300
[Mi Okt 16 14:43:15 2019]  __neigh_update+0x432/0x9f0
[Mi Okt 16 14:43:15 2019]  arp_process+0x255/0x7e0
[Mi Okt 16 14:43:15 2019]  ? ip_forward+0x3c5/0x480
[Mi Okt 16 14:43:15 2019]  arp_rcv+0x18c/0x1d0
[Mi Okt 16 14:43:15 2019]  __netif_receive_skb_one_core+0x87/0x90
[Mi Okt 16 14:43:15 2019]  netif_receive_skb_internal+0x41/0xb0
[Mi Okt 16 14:43:15 2019]  napi_gro_receive+0xf6/0x160
[Mi Okt 16 14:43:15 2019]  e1000_clean_rx_irq+0x2ac/0x570 [e1000]
[Mi Okt 16 14:43:15 2019]  e1000_clean+0x278/0x5f0 [e1000]
[Mi Okt 16 14:43:15 2019]  net_rx_action+0x148/0x3b0
[Mi Okt 16 14:43:15 2019]  __do_softirq+0xed/0x30e
[Mi Okt 16 14:43:15 2019]  irq_exit+0xf1/0x100
[Mi Okt 16 14:43:15 2019]  do_IRQ+0x81/0xe0
[Mi Okt 16 14:43:15 2019]  common_interrupt+0xf/0xf
[Mi Okt 16 14:43:15 2019]  </IRQ>
[Mi Okt 16 14:43:15 2019] RIP: 0010:native_safe_halt+0xe/0x10
[Mi Okt 16 14:43:15 2019] Code: 90 90 90 90 90 90 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 36 06 44 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 26 06 44 00 fb f4 <c3> 90 66 66 66 66 90 41 54 55 53 e8 e2 01 79 ff 65 8b 2d cb 7f 64
[Mi Okt 16 14:43:15 2019] RSP: 0018:ffff9486406bbeb0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd5
[Mi Okt 16 14:43:15 2019] RAX: ffffffff829c7d70 RBX: 0000000000000003 RCX: 0000000000000000
[Mi Okt 16 14:43:15 2019] RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff89503bb9c500
[Mi Okt 16 14:43:15 2019] RBP: 0000000000000003 R08: 00000066a172847b R09: 0000000000000000
[Mi Okt 16 14:43:15 2019] R10: ffff89503bb97888 R11: 0000000000000000 R12: ffff89503a84be80
[Mi Okt 16 14:43:15 2019] R13: 0000000000000000 R14: 0000000000000000 R15: ffff89503a84be80
[Mi Okt 16 14:43:15 2019]  ? __cpuidle_text_start+0x8/0x8
[Mi Okt 16 14:43:15 2019]  default_idle+0x1a/0x140
[Mi Okt 16 14:43:15 2019]  do_idle+0x1fb/0x260
[Mi Okt 16 14:43:15 2019]  ? do_idle+0x17b/0x260
[Mi Okt 16 14:43:15 2019]  cpu_startup_entry+0x19/0x20
[Mi Okt 16 14:43:15 2019]  start_secondary+0x17d/0x1d0
[Mi Okt 16 14:43:15 2019]  secondary_startup_64+0xa4/0xb0
[Mi Okt 16 14:43:15 2019] ---[ end trace 601c6ff4362427bd ]---

Regards,
--
Thomas Bartschies
CVK IT Systeme

-----Original Message-----
From: Bartschies, Thomas 
Sent: Tuesday, 15 October 2019 12:12
To: 'Eric Dumazet' <eric.dumazet@...il.com>; 'David Ahern' <dsahern@...il.com>; 'netdev@...r.kernel.org' <netdev@...r.kernel.org>
Subject: Re: Re: big ICMP requests get disrupted on IPSec tunnel activation

Hello Eric,

No problem, I've just done that. Here are the results; I hope they help. I also ran a parallel ftrace debug session, but it contained no conclusive information. I had tried that before as well.

[Di Okt 15 11:51:06 2019] WARNING: CPU: 1 PID: 0 at net/sched/sch_fq.c:389 fq_enqueue+0x531/0x610 [sch_fq]
[Di Okt 15 11:51:06 2019] Modules linked in: twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic cmac xcbc rmd160 sha512_ssse3 sha512_generic af_key xt_CHECKSUM ipt_rpfilter xt_statistic xt_connlimit nf_conncount ip_set_hash_ip br_netfilter bridge stp llc xt_HL xt_hl ipt_REJECT nf_reject_ipv4 xt_REDIRECT xt_recent xt_realm xt_CT xt_NETMAP xt_MASQUERADE xt_esp ipt_ECN xt_ecn xt_comment ipt_CLUSTERIP ipt_ah xt_addrtype act_police cls_basic cls_flow cls_fw cls_u32 sch_tbf sch_prio sch_htb sch_hfsc sch_ingress sch_sfq xt_set ip_set xt_NFLOG nf_log_ipv4 nf_log_common xt_LOG nf_conntrack_sane nf_conntrack_netlink nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_conntrack_tftp
[Di Okt 15 11:51:06 2019]  nf_conntrack_sip nf_conntrack_pptp nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ts_kmp nf_conntrack_amanda xt_TPROXY nf_tproxy_ipv6 nf_tproxy_ipv4 xt_time xt_TCPMSS xt_tcpmss xt_sctp xt_policy xt_pkttype xt_physdev xt_owner xt_NFQUEUE xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp xt_connmark xt_CLASSIFY xt_AUDIT iptable_raw iptable_nat nf_nat iptable_mangle iptable_filter vmw_vsock_vmci_transport vsock ip6t_REJECT nf_reject_ipv6 xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nfnetlink_log nfnetlink ip6table_filter ip6_tables sb_edac crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_rapl_perf joydev vmw_balloon vmw_vmci i2c_piix4 sch_fq tcp_illinois binfmt_misc ip_tables vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw mptspi e1000 scsi_transport_spi mptscsih mptbase ata_generic pata_acpi
[Di Okt 15 11:51:06 2019] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.2.18-200.fc30.x86_64 #1
[Di Okt 15 11:51:06 2019] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 12/12/2018
[Di Okt 15 11:51:06 2019] RIP: 0010:fq_enqueue+0x531/0x610 [sch_fq]
[Di Okt 15 11:51:06 2019] Code: db 48 8b 3d 49 28 00 00 48 89 ee e8 39 ba 03 db 45 85 ed 75 d4 49 8b 2f e9 f1 fb ff ff 48 83 83 30 02 00 00 01 e9 fd fc ff ff <0f> 0b e9 83 fc ff ff 4c 89 ea 31 c0 e9 be fc ff ff 41 8b 46 08 39
[Di Okt 15 11:51:06 2019] RSP: 0018:ffffa2fe806f0a58 EFLAGS: 00010206
[Di Okt 15 11:51:06 2019] RAX: 0000000df8475800 RBX: ffff973638a32800 RCX: 0000000000000000
[Di Okt 15 11:51:06 2019] RDX: 000003df1924d162 RSI: 0000000000000004 RDI: ffffffffc02845b5
[Di Okt 15 11:51:06 2019] RBP: 15cdc829bb26f866 R08: 0000000000000000 R09: ffff97363afc4e00
[Di Okt 15 11:51:06 2019] R10: 0000000000000078 R11: ffff97363808b088 R12: ffff97363821ce00
[Di Okt 15 11:51:06 2019] R13: ffff973634620cb0 R14: 00000000000003cb R15: ffff973635965c70
[Di Okt 15 11:51:06 2019] FS:  0000000000000000(0000) GS:ffff97363ba80000(0000) knlGS:0000000000000000
[Di Okt 15 11:51:06 2019] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Di Okt 15 11:51:06 2019] CR2: 00005616fa9f90b8 CR3: 0000000135084005 CR4: 00000000000606e0
[Di Okt 15 11:51:06 2019] Call Trace:
[Di Okt 15 11:51:06 2019]  <IRQ>
[Di Okt 15 11:51:06 2019]  ? fq_flow_purge+0x70/0x70 [sch_fq]
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  __dev_queue_xmit+0x45b/0x960
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_finish_output2+0x2c8/0x580
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_do_fragment+0x69d/0x7a0
[Di Okt 15 11:51:06 2019]  ? __ip_flush_pending_frames.isra.0+0x80/0x80
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_output+0x71/0xf0
[Di Okt 15 11:51:06 2019]  ? ip_finish_output2+0x580/0x580
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_forward+0x3c5/0x480
[Di Okt 15 11:51:06 2019]  ? ip_defrag.cold+0x37/0x37
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  ip_rcv+0xbc/0xd0
[Di Okt 15 11:51:06 2019]  ? ip_rcv_finish_core.isra.0+0x390/0x390
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  __netif_receive_skb_one_core+0x80/0x90
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  netif_receive_skb_internal+0x41/0xb0
[Di Okt 15 11:51:06 2019]  ? return_to_handler+0x15/0x36
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  napi_gro_receive+0xf6/0x160
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  e1000_clean_rx_irq+0x2ac/0x570 [e1000]
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  e1000_clean+0x278/0x5f0 [e1000]
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  net_rx_action+0x148/0x3b0
[Di Okt 15 11:51:06 2019]  ? napi_complete_done+0x110/0x110
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  __do_softirq+0xed/0x30e
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  irq_exit+0xf1/0x100
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  do_IRQ+0x81/0xe0
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  common_interrupt+0xf/0xf
[Di Okt 15 11:51:06 2019]  </IRQ>
[Di Okt 15 11:51:06 2019] RIP: 0010:native_safe_halt+0xe/0x10
[Di Okt 15 11:51:06 2019] Code: 90 90 90 90 90 90 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 46 06 44 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 36 06 44 00 fb f4 <c3> 90 e8 eb 98 03 00 41 54 55 53 e8 f2 01 79 ff 65 8b 2d db 7f 64
[Di Okt 15 11:51:06 2019] RSP: 0018:ffffa2fe806abeb0 EFLAGS: 00000282 ORIG_RAX: ffffffffffffffd3
[Di Okt 15 11:51:06 2019] RAX: ffffffff9b9c7d60 RBX: 0000000000000001 RCX: 0000000000000000
[Di Okt 15 11:51:06 2019] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff9b1140d6
[Di Okt 15 11:51:06 2019] RBP: 0000000000000001 R08: 0000000000000000 R09: ffff97363afc4e00
[Di Okt 15 11:51:06 2019] R10: 0000000000000c78 R11: ffff973638092c88 R12: ffff97363a84ddc0
[Di Okt 15 11:51:06 2019] R13: 0000000000000000 R14: 0000000000000000 R15: ffff97363a84ddc0
[Di Okt 15 11:51:06 2019]  ? __cpuidle_text_start+0x8/0x8
[Di Okt 15 11:51:06 2019]  ? do_idle+0x1f6/0x260
[Di Okt 15 11:51:06 2019]  default_idle+0x1a/0x140
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  do_idle+0x1fb/0x260
[Di Okt 15 11:51:06 2019]  ? cpu_startup_entry+0x19/0x20
[Di Okt 15 11:51:06 2019]  ? ftrace_graph_caller+0xa0/0xa0
[Di Okt 15 11:51:06 2019]  cpu_startup_entry+0x19/0x20
[Di Okt 15 11:51:06 2019]  start_secondary+0x17d/0x1d0
[Di Okt 15 11:51:06 2019]  secondary_startup_64+0xa4/0xb0
[Di Okt 15 11:51:06 2019] ---[ end trace 34f95424acff5ddb ]---

Regards,
--
Thomas Bartschies
CVK IT Systeme

-----Original Message-----
From: Eric Dumazet [mailto:eric.dumazet@...il.com] 
Sent: Monday, 14 October 2019 17:32
To: Bartschies, Thomas <Thomas.Bartschies@....de>; 'David Ahern' <dsahern@...il.com>; 'netdev@...r.kernel.org' <netdev@...r.kernel.org>
Subject: Re: Re: big ICMP requests get disrupted on IPSec tunnel activation



On 10/14/19 7:02 AM, Bartschies, Thomas wrote:
> Hello,
> 
> it took a while to build a test system for bisecting the issue, but I've finally identified the patch that causes my problems.
> By the way, the fq packet scheduler is in use.
> 
> It's
> [PATCH net-next] tcp/fq: move back to CLOCK_MONOTONIC
> 
> In the recent TCP/EDT patch series, I switched TCP and sch_fq clocks 
> from MONOTONIC to TAI, in order to match the choice made earlier for the sch_etf packet scheduler.
> 
> But sure enough, this broke some setups where the TAI clock jumps forward (by almost 50 years...), as reported by Leonard Crestez.
> 
> If we want to converge later, we'll probably need to add an skb field to differentiate the clock bases, or a socket option.
> 
> In the meantime, a UDP application will need to use the CLOCK_MONOTONIC 
> base for its SCM_TXTIME timestamps if using the fq packet scheduler.
> 
> Fixes: 72b0094f9182 ("tcp: switch tcp_clock_ns() to CLOCK_TAI base")
> Fixes: 142537e41923 ("net_sched: sch_fq: switch to CLOCK_TAI")
> Fixes: fd2bca2aa789 ("tcp: switch internal pacing timer to CLOCK_TAI")
> Signed-off-by: Eric Dumazet <edumazet@...xxxxxxx>
> Reported-by: Leonard Crestez <leonard.crestez@...xxxx>
> 
> ----
> 
> After reverting it in a current 5.2.18 kernel, the problem disappears. 
> There were some follow-up fixes for other issues caused by this patch. 
> These fixed other similar problems, but not mine. I've already tried to 
> set the tstamp to zero in xfrm4_output.c, but with no luck so far. I'm pretty sure that reverting the clock patch isn't the proper solution for upstream, so in what other way can this be fixed?


Thanks a lot, Thomas, for this report!

I guess you could add a debug check in fq to let us know the call graph.

Something like the following:

diff --git a/net/sched/sch_fq.c b/net/sched/sch_fq.c
index 98dd87ce15108cfe1c011da44ba32f97763776c8..2aa41a39e81b94f3b7092dc51b91829f5929634d 100644
--- a/net/sched/sch_fq.c
+++ b/net/sched/sch_fq.c
@@ -380,9 +380,14 @@ static void flow_queue_add(struct fq_flow *flow, struct sk_buff *skb)
 {
        struct rb_node **p, *parent;
        struct sk_buff *head, *aux;
+       s64 delay;
 
        fq_skb_cb(skb)->time_to_send = skb->tstamp ?: ktime_get_ns();
 
+       /* We should really add a TCA_FQ_MAX_HORIZON  at some point :( */
+       delay = fq_skb_cb(skb)->time_to_send - ktime_get_ns();
+       WARN_ON_ONCE(delay > 60 * NSEC_PER_SEC);
+
        head = flow->head;
        if (!head ||
            fq_skb_cb(skb)->time_to_send >= fq_skb_cb(flow->tail)->time_to_send) {
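
As an aside, the CLOCK_MONOTONIC requirement from the quoted commit message looks roughly like this from userspace. A minimal, untested sketch (assuming a 4.19+ kernel with SO_TXTIME; the helper name send_at and the surrounding socket setup are illustrative, not from this thread):

#include <linux/net_tstamp.h>	/* struct sock_txtime */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>

#ifndef SO_TXTIME
#define SO_TXTIME 61
#define SCM_TXTIME SO_TXTIME
#endif

/* Queue a datagram for transmission delay_ns from now, with the
 * departure time expressed in CLOCK_MONOTONIC nanoseconds - the
 * base fq expects after the revert to CLOCK_MONOTONIC.
 */
static int send_at(int fd, const void *buf, size_t len, uint64_t delay_ns)
{
	struct sock_txtime cfg = { .clockid = CLOCK_MONOTONIC, .flags = 0 };
	struct timespec now;
	uint64_t txtime;
	char control[CMSG_SPACE(sizeof(txtime))] = {};
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = control, .msg_controllen = sizeof(control),
	};
	struct cmsghdr *cm;

	if (setsockopt(fd, SOL_SOCKET, SO_TXTIME, &cfg, sizeof(cfg)) < 0)
		return -1;

	/* The departure time must use the same base as cfg.clockid. */
	clock_gettime(CLOCK_MONOTONIC, &now);
	txtime = (uint64_t)now.tv_sec * 1000000000ULL + now.tv_nsec + delay_ns;

	cm = CMSG_FIRSTHDR(&msg);
	cm->cmsg_level = SOL_SOCKET;
	cm->cmsg_type = SCM_TXTIME;
	cm->cmsg_len = CMSG_LEN(sizeof(txtime));
	memcpy(CMSG_DATA(cm), &txtime, sizeof(txtime));

	return sendmsg(fd, &msg, 0) < 0 ? -1 : 0;
}

A departure time taken from any other clock base (CLOCK_TAI, CLOCK_REALTIME) lands in fq as a timestamp decades in the future, which is exactly the situation the WARN_ON_ONCE above is meant to catch.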


> 
> ---
> [PATCH net] net: clear skb->tstamp in bridge forwarding path
>
> Matteo reported forwarding issues inside the linux bridge, if the enslaved interfaces use the fq qdisc.
> 
> Similar to commit 8203e2d844d3 ("net: clear skb->tstamp in forwarding 
> paths"), we need to clear the tstamp field in the bridge forwarding path.
> 
> Fixes: 80b14dee2bea ("net: Add a new socket option for a future 
> transmit time.")
> Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
> Reported-and-tested-by: Matteo Croce <mcroce@...hat.com>
> Signed-off-by: Paolo Abeni <pabeni@...hat.com>
> 
> and
> 
> net: clear skb->tstamp in forwarding paths
> 
> Sergey reported that forwarding was no longer working if fq packet scheduler was used.
> 
> This is caused by the recent switch to EDT model, since incoming 
> packets might have been timestamped by __net_timestamp()
> 
> __net_timestamp() uses ktime_get_real(), while fq expects packets using CLOCK_MONOTONIC base.
> 
> The fix is to clear skb->tstamp in forwarding paths.
> 
> Fixes: 80b14dee ("net: Add a new socket option for a future transmit 
> time.")
> Fixes: fb420d5d ("tcp/fq: move back to CLOCK_MONOTONIC")
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Reported-by: Sergey Matyukevich <geomatsi@...il.com>
> Tested-by: Sergey Matyukevich <geomatsi@...il.com>
> Signed-off-by: David S. Miller <davem@...emloft.net>
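
Both fixes quoted above boil down to the same one-line idea; schematically (a simplified sketch of the kind of change, not the verbatim upstream hunks):

/* A forwarded skb may still carry a CLOCK_REALTIME stamp set by
 * __net_timestamp() on receive. fq would misread that as a
 * CLOCK_MONOTONIC departure time far in the future and hold the
 * packet. Clearing the field before the skb re-enters the xmit
 * path (e.g. in ip_forward() or the bridge forward path) restores
 * normal scheduling.
 */
skb->tstamp = 0;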
> 
> Best regards,
> --
> Thomas Bartschies
> CVK IT Systeme
> 
> 
> -----Original Message-----
> From: Bartschies, Thomas
> Sent: Tuesday, 17 September 2019 09:28
> To: 'David Ahern' <dsahern@...il.com>; 'netdev@...r.kernel.org' <netdev@...r.kernel.org>
> Subject: Re: big ICMP requests get disrupted on IPSec tunnel activation
> 
> Hello,
> 
> thanks for the suggestion. Running pmtu.sh with kernel versions 4.19, 4.20 and even 5.2.13 made no difference: all tests were successful every time.
> 
> My external ping tests still fail with the newer kernels, though. I ran the script after triggering my problem, to make sure any possible side effects were in play.
> 
> Please keep in mind that even when the ICMP requests stall, other connections still go through, e.g. ssh or tracepath. I would expect all connection types to be affected if this were an MTU problem. Am I wrong?
> 
> Any suggestions for more tests to isolate the cause?
> 
> Best regards,
> --
> Thomas Bartschies
> CVK IT Systeme
> 
> -----Original Message-----
> From: David Ahern [mailto:dsahern@...il.com]
> Sent: Friday, 13 September 2019 19:13
> To: Bartschies, Thomas <Thomas.Bartschies@....de>; 'netdev@...r.kernel.org' <netdev@...r.kernel.org>
> Subject: Re: big ICMP requests get disrupted on IPSec tunnel activation
> 
> On 9/13/19 9:59 AM, Bartschies, Thomas wrote:
>> Hello together,
>>
>> since kernel 4.20 we've been observing strange behaviour when sending big ICMP packets, for example with a packet size of 3000 bytes.
>> The packets should be forwarded by a Linux gateway (firewall) with multiple interfaces, also acting as a VPN gateway.
>>
>> Test steps:
>> 1. Disable all iptables rules.
>> 2. Enable the VPN IPSec policies.
>> 3. Start a ping with a packet size of e.g. 3000 bytes from a client in the DMZ, passing through the machine, targeting another LAN machine.
>> 4. The ping works.
>> 5. Enable a VPN policy by sending pings from the gateway to a tunnel target. The system tries to create the tunnel.
>> 6. The ping from step 3 immediately stalls. No error messages, it just stops.
>> 7. Stop the ping from step 3. Start another one without the packet size parameter. It stalls as well.
>>
>> Result:
>> Connections from the client to other services on the LAN machine still work. Tracepath works. Only ICMP requests no longer pass the gateway: tcpdump sees them on the incoming interface, but not on the outgoing LAN interface. ICMP requests to any other target IP address in the LAN still work, until one uses a bigger packet size; then these alternative connections stall as well.
>>
>> Flushing the policy table has no effect. Flushing the conntrack table has no effect. Setting rp_filter to loose (2) has no effect.
>> Flushing the route cache has no effect.
>>
>> Only a reboot of the gateway restores normal behavior.
>>
>> What can be the cause? Is this a networking bug?
>>
> 
> some of these tests will most likely fail for other reasons, but can you run 'tools/testing/selftests/net/pmtu.sh' [1] on 4.19 and then on 4.20 and compare the results? Hopefully that will shed some light on the problem and can be used to bisect to the commit that caused the regression.
> 
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/net/pmtu.sh
> 
