lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Tue, 15 Feb 2022 10:01:22 -0500 From: Willem de Bruijn <willemdebruijn.kernel@...il.com> To: Tao Liu <thomas.liu@...oud.cn> Cc: David Miller <davem@...emloft.net>, Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>, David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>, "Samudrala, Sridhar" <sridhar.samudrala@...el.com>, Network Development <netdev@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org> Subject: Re: [PATCH] gso: do not skip outer ip header in case of ipip and net_failover On Mon, Feb 14, 2022 at 8:38 PM Tao Liu <thomas.liu@...oud.cn> wrote: > > Sorry to resend it. > > 2022年2月14日 12:27,Willem de Bruijn <willemdebruijn.kernel@...il.com> 写道: > > On Sun, Feb 13, 2022 at 11:03 PM Tao Liu <thomas.liu@...oud.cn> wrote: > > > Sorry for bothering, just repost it. > > 2022年2月14日 09:28,Willem de Bruijn <willemdebruijn.kernel@...il.com> 写道: > > On Sun, Feb 13, 2022 at 10:10 AM Tao Liu <thomas.liu@...oud.cn> wrote: > > > We encouter a tcp drop issue in our cloud environment. Packet GROed in host > forwards to a VM virtio_net nic with net_failover enabled. VM acts as a > IPVS LB with ipip encapsulation. The full path like: > host gro -> vm virtio_net rx -> net_failover rx -> ipvs fullnat > -> ipip encap -> net_failover tx -> virtio_net tx > > When net_failover transmits a ipip pkt (gso_type = 0x0103), there is no gso > performed because it supports TSO and GSO_IPXIP4. But network_header has > been pointing to inner ip header. > > > If the packet is configured correctly, and net_failover advertises > that it can handle TSO packets with IPIP encap, then still virtio_net > should not advertise it and software GSO be applied on its > dev_queue_xmit call. > > This is assuming that the packet not only has SKB_GSO_IPXIP4 correctly > set, but also tunneling fields like skb->encapsulated and > skb_inner_network_header. > > Thanks very much for your comment! > > Yes, the packet is correct. Another thing i have not pointed directly is > that the pkt has SKB_GSO_DODGY. net_failover do not advertises GSO_ROBUST > but virtio_net do. > > > If net_failover does not advertise NETIF_F_GSO_ROBUST, then > tcp_gso_segment will pass a packet with SKB_GSO_DODGY to the > software gso stack, not taking the branch > > if (skb_gso_ok(skb, features | NETIF_F_GSO_ROBUST)) { > > As i tested, packet with SKB_GSO_DODGY hits this branch. packet's gso_type=0x0103, which > means SKB_GSO_TCPV4, SKB_GSO_DODGY and SKB_GSO_IPXIP4. net_failover matches > the condition. > > Consequently, tcp_gso_segment returns NULL, there is no software gso did here. And > network_header points to inner iph. > > Software gso is did by virtio_net which not advertises NETIF_F_GSO_IPXIP4. It skips the outer > iph, and keeps it unchanged. > > --- > net/ipv4/af_inet.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c > index 9c465ba..f8b3f8a 100644 > --- a/net/ipv4/af_inet.c > +++ b/net/ipv4/af_inet.c > @@ -1425,10 +1425,18 @@ struct sk_buff *inet_gso_segment(struct sk_buff *skb, > static struct sk_buff *ipip_gso_segment(struct sk_buff *skb, > netdev_features_t features) > { > + struct sk_buff *segs; > + int nhoff; > + > if (!(skb_shinfo(skb)->gso_type & SKB_GSO_IPXIP4)) > return ERR_PTR(-EINVAL); > > - return inet_gso_segment(skb, features); > + nhoff = skb_network_header(skb) - skb_mac_header(skb); > + segs = inet_gso_segment(skb, features); > + if (!segs) > + skb->network_header = skb_mac_header(skb) + nhoff - skb->head; > + > + return segs; > } > > > If this would be needed for IPIP, then the same would be needed for SIT, etc. > > Is the skb_network_header > > 1. correctly pointing to the outer header of the TSO packet before the > call to inet_gso_segment > 2. incorrectly pointing to the inner header of the (still) TSO packet > after the call to inet_gso_segment > > inet_gso_segment already does the same operation: save nhoff, pull > network header, call callbacks.gso_segment (which can be > ipip_gso_segment->inet_gso_segment), then place the network header > back at nhoff. > > values print in skb_mac_gso_segment() before callbacks.gso_segment: > ipip: vlan_depth=0 mac_len=0 skb->network_header=206 > net_failover: vlan_depth=14 mac_len=14 skb->network_header=186 > virtio_net: vlan_depth=34 mac_len=34 skb->network_header=206 > > agree to add sit/ip4ip6/ip6ip6, and patch can be simplified as: > > > If IPIP GSO was so broken, I think we would have found it long before. > > As said, inet_gso_segment should already do the right thing for ipip: > it will be called twice. > > > SKB_GSO_DODGY flag and net_failover conduct this issue. local traffic just works fine. Got it. That is an uncommon combination. SKB_GSO_DODGY is set from external virtio_net, which does not support tunnels. But a path with an added tunnel might cause this combination. And inet_gso_segment resets the network header, both times, before calling callbacks.gso_segment() skb_reset_network_header(skb); nhoff = skb_network_header(skb) - skb_mac_header(skb); [...] if (likely(ops && ops->callbacks.gso_segment)) segs = ops->callbacks.gso_segment(skb, features); And resets that after for each skb in segs. skb = segs; do { [...] skb->network_header = (u8 *)iph - skb->head; But does not do this if segs == NULL. The packet has to be restored before it is passed to the device. I think we have to handle this case correctly in inet_gso_segment, instead of patching it up in all the various tunnel devices. The same holds for ipv6_gso_segment.
Powered by blists - more mailing lists