lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <1390700277.27806.72.camel@edumazet-glaptop2.roam.corp.google.com> Date: Sat, 25 Jan 2014 17:37:57 -0800 From: Eric Dumazet <eric.dumazet@...il.com> To: Florian Westphal <fw@...len.de> Cc: netdev@...r.kernel.org Subject: Re: [PATCH 2/2] net: ip, ipv6: handle gso skbs in forwarding path On Sat, 2014-01-25 at 23:48 +0100, Florian Westphal wrote: > Given: > Host <mtu1500> R1 <mtu1200> R2 > > Host sends tcp stream which is routed via R1 and R2. > R1 nics perform GRO. > > In this case, the kernel will fail to send ICMP fragmentation needed > messages (or pkt too big for ipv6), as gso packets currently bypass the > dst mtu checks in forward path. Instead, Linux tries to send out packets > exceeding R1-R2 link mtu. > > When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does > not fragment the packets when forwarding, and again tries to send out > packets exceeding R1-R2 link mtu. > > This alters the forwarding dstmtu checks to take the individual gso > segment lengths into account. > > For ipv6, we send out pkt too big error for gso if the individual > segments are too big. > > For ipv4, we either send icmp fragmentation needed, or, if the DF bit > is not set, perform software segmentation and let the output path > create fragments when the packet is leaving the machine. This is > technically not 100% accurate, as the error message will contain the > gro-skb tcp header (we'd have to segment unconditionally and use first > segment instead), but it does seem to work. > > Signed-off-by: Florian Westphal <fw@...len.de> > --- > include/linux/skbuff.h | 17 +++++++++++++++ > net/ipv4/ip_forward.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++-- > net/ipv6/ip6_output.c | 18 ++++++++++++++-- > 3 files changed, 89 insertions(+), 4 deletions(-) > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > index 3d76066..37cb679 100644 > --- a/include/linux/skbuff.h > +++ b/include/linux/skbuff.h > @@ -2811,5 +2811,22 @@ static inline bool skb_head_is_locked(const struct sk_buff *skb) > { > return !skb->head_frag || skb_cloned(skb); > } > + > +/** > + * skb_gso_network_seglen - Return length of individual segments of a gso packet > + * > + * @skb: GSO skb > + * > + * skb_gso_network_seglen is used to determine the real size of the > + * individual segments, including Layer3 (IP, IPv6) and L4 headers (TCP/UDP). > + * > + * The MAC/L2 header is not accounted for. > + */ > +static inline unsigned int skb_gso_network_seglen(const struct sk_buff *skb) > +{ > + unsigned int hdr_len = skb_transport_header(skb) - > + skb_network_header(skb); > + return hdr_len + skb_gso_transport_seglen(skb); > +} > #endif /* __KERNEL__ */ > #endif /* _LINUX_SKBUFF_H */ > diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c > index 694de3b..78f2e2a 100644 > --- a/net/ipv4/ip_forward.c > +++ b/net/ipv4/ip_forward.c > @@ -39,6 +39,58 @@ > #include <net/route.h> > #include <net/xfrm.h> > > +static bool ip_may_fragment(const struct sk_buff *skb) > +{ > + return unlikely((ip_hdr(skb)->frag_off & htons(IP_DF)) == 0) || > + !skb->local_df; > +} > + > +static bool ip_gso_exceeds_dst_mtu(const struct sk_buff *skb) > +{ > + if (skb->local_df || !skb_is_gso(skb)) > + return false; > + return skb_gso_network_seglen(skb) > dst_mtu(skb_dst(skb)); > +} > + > +static bool ip_exceeds_mtu(const struct sk_buff *skb, unsigned int mtu) > +{ > + unsigned len; > + > + if (skb->local_df) > + return false; > + len = skb_is_gso(skb) ? skb_gso_network_seglen(skb) : skb->len; > + > + return len > mtu; The function should avoid extra computation/tests for small packets. if (skb->len <= mtu || skb->local_df) return false; if (skb_is_gso(skb) && skb_gso_network_seglen(skb) <= mtu) return false; return true; > +} > + > +/* called if GSO skb needs to be fragmented on forward. */ > +static int ip_forward_finish_gso(struct sk_buff *skb) > +{ > + struct sk_buff *segs = skb_gso_segment(skb, 0); 0 is very pessimistic. Have you tried : netdev_features_t features = netif_skb_features(skb); struct sk_buff *segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK); > + int ret = 0; > + > + if (IS_ERR(segs)) { > + kfree_skb(skb); > + return -ENOMEM; > + } > + > + consume_skb(skb); > + > + do { > + struct sk_buff *nskb = segs->next; > + int err; > + > + segs->next = NULL; > + err = dst_output(segs); > + > + if (err && ret == 0) > + ret = err; > + segs = nskb; > + } while (segs); > + > + return ret; > +} > + -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists