Message-ID: <20120925213623.39ee67d1@nehalam.linuxnetplumber.net>
Date: Tue, 25 Sep 2012 21:36:23 -0700
From: Stephen Hemminger <shemminger@...tta.com>
To: Jesse Gross <jesse@...ira.com>
Cc: Chris Wright <chrisw@...hat.com>,
David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCHv4 net-next] vxlan: virtual extensible lan

On Tue, 25 Sep 2012 14:55:13 -0700
Jesse Gross <jesse@...ira.com> wrote:
> On Mon, Sep 24, 2012 at 2:50 PM, Stephen Hemminger
> <shemminger@...tta.com> wrote:
> > +static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev)
> [...]
> > +        /* Do PMTU */
> > +        if (skb->protocol == htons(ETH_P_IP)) {
> > +                df |= old_iph->frag_off & htons(IP_DF);
> > +                if (df && mtu < pkt_len) {
> > +                        icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
> > +                                  htonl(mtu));
> > +                        ip_rt_put(rt);
> > +                        goto tx_error;
> > +                }
> > +        }
> > +#if IS_ENABLED(CONFIG_IPV6)
> > +        else if (skb->protocol == htons(ETH_P_IPV6)) {
> > +                if (mtu >= IPV6_MIN_MTU && mtu < pkt_len) {
> > +                        icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
> > +                        ip_rt_put(rt);
> > +                        goto tx_error;
> > +                }
> > +        }
> > +#endif
>
> Won't this black hole packets if we need to generate ICMP messages?
> Since we're doing switching and not routing here icmp_send() doesn't
> necessarily have a route to the relevant endpoint. It looks like
> Ethernet over GRE has this issue as well.

How to handle packets whose inner header is IPv6, or IPv4 with Don't Fragment
set, is an interesting question. As you mention, sending an ICMP response won't
work because the tunnel endpoint is not part of that IP network.

The simple option is to fragment in the tunnel; since the fragmentation is not
visible to the overlay network, that is okay. But for PMTU discovery it might
be better to just drop the packet and not send a fragmented payload.
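
The two choices above (fragment inside the tunnel, or drop) can be written
down as a small decision function. This is only a sketch in standalone C, not
kernel code and not what the patch does; the enum, the function name, and the
strict_pmtu policy knob are hypothetical and exist purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not part of the vxlan patch. */
enum tunnel_pmtu_action {
	TUNNEL_FORWARD,   /* encapsulated packet fits, send as-is */
	TUNNEL_FRAGMENT,  /* fragment the outer packet inside the tunnel */
	TUNNEL_DROP       /* drop, without sending ICMP to the inner sender */
};

/*
 * pkt_len:       length of the packet to be sent through the tunnel
 * mtu:           path MTU available to that packet
 * inner_no_frag: inner header is IPv6, or IPv4 with DF set
 * strict_pmtu:   assumed policy knob: prefer dropping over fragmenting
 *                when the inner packet must not be fragmented
 */
static enum tunnel_pmtu_action
tunnel_pmtu_decide(size_t pkt_len, size_t mtu,
		   bool inner_no_frag, bool strict_pmtu)
{
	if (pkt_len <= mtu)
		return TUNNEL_FORWARD;
	if (inner_no_frag && strict_pmtu)
		return TUNNEL_DROP;
	return TUNNEL_FRAGMENT;
}
```

With strict_pmtu false this is the "simple option"; with it true, oversized
DF or IPv6 inner packets are dropped and everything else is still fragmented.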

Some backbone networks don't allow fragmentation at all (in a futile attempt
to block DoS attacks and protect fragile Windows hosts). Fragmentation brings
all sorts of evil problems, like the potential for corrupted reassembly because
of sequence wrap; the checksum in the inner packet will defend against that,
but tunnels are not supposed to rely on inner protocol data protection.
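
To put a rough number on the wrap problem, assuming "sequence wrap" here means
the 16-bit IPv4 Identification field wrapping between one pair of endpoints:
the helper below (a hypothetical name, standalone C) estimates how long it
takes before an ID is reused at a given packet rate.

```c
#include <stdint.h>

/* The IPv4 Identification field is 16 bits wide. */
#define IPV4_ID_SPACE 65536u

/*
 * Milliseconds until a sender emitting pkts_per_sec packets to one
 * destination reuses an IP ID. Hypothetical helper for illustration.
 * Returns 0 when pkts_per_sec is 0 (no traffic, so no wrap to measure).
 */
static uint64_t ip_id_wrap_ms(uint64_t pkts_per_sec)
{
	if (pkts_per_sec == 0)
		return 0;
	return (uint64_t)IPV4_ID_SPACE * 1000u / pkts_per_sec;
}
```

At roughly 83333 packets per second (about 1 Gbit/s of 1500-byte packets) this
gives 786 ms, far shorter than a reassembly timeout measured in tens of
seconds (Linux defaults net.ipv4.ipfrag_time to 30), so a stale fragment can
be matched against a newer packet carrying the same ID.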

Or you can just do what Cisco and Microsoft do and tell everyone to set a
larger MTU on the backbone.
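
For a sense of how much larger, here is a sketch of the arithmetic, assuming
an IPv4 outer header without options and no VLAN tag on the inner frame. The
function name and constants are hypothetical and are not taken from the patch.

```c
/* Per-packet bytes added in front of the inner IP packet (assumed layout). */
enum {
	INNER_ETH_HLEN  = 14,  /* inner Ethernet header carried in the tunnel */
	VXLAN_HLEN      = 8,   /* VXLAN header */
	UDP_HLEN        = 8,   /* outer UDP header */
	OUTER_IPV4_HLEN = 20   /* outer IPv4 header, no options */
};

/* Smallest backbone IP MTU that carries inner_mtu without fragmentation. */
static unsigned int vxlan_required_underlay_mtu(unsigned int inner_mtu)
{
	return inner_mtu + INNER_ETH_HLEN + VXLAN_HLEN + UDP_HLEN +
	       OUTER_IPV4_HLEN;
}
```

For the usual inner MTU of 1500 this comes to 1550, i.e. 50 bytes of overhead.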