netdev - Re: [PATCH 03/19] netfilter: nf_conntrack_ipv6: improve fragmentation handling

Message-ID: <Pine.LNX.4.64.1208291007520.32281@ask.diku.dk>
Date:	Wed, 29 Aug 2012 10:21:41 +0200 (CEST)
From:	Jesper Dangaard Brouer <hawk@...u.dk>
To:	Patrick McHardy <kaber@...sh.net>
Cc:	Pablo Neira Ayuso <pablo@...filter.org>,
	Netfilter Developers <netfilter-devel@...r.kernel.org>,
	netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH 03/19] netfilter: nf_conntrack_ipv6: improve fragmentation
 handling


Signed-off-by: Jesper Dangaard Brouer <brouer@...hat.com>

And some nitpicks below...

On Tue, 28 Aug 2012, Patrick McHardy wrote:

> The IPv6 conntrack fragmentation currently has a couple of shortcomings.
> Fragments are collected in PREROUTING/OUTPUT, are defragmented, the
> defragmented packet is then passed to conntrack, the resulting conntrack
> information is attached to each original fragment and the fragments then
> continue their way through the stack.
>
> Helper invocation occurs in the POSTROUTING hook, at which point only
> the original fragments are available. The result of this is that
> fragmented packets are never passed to helpers.
>
> This patch improves the situation in the following way:
>
> - If a reassembled packet belongs to a connection that has a helper
>  assigned, the reassembled packet is passed through the stack instead
>  of the original fragments.
>
> - During defragmentation, the largest received fragment size is stored.
>  On output, the packet is refragmented if required. If the largest
>  received fragment size exceeds the outgoing MTU, a "packet too big"
>  message is generated, thus behaving as if the original fragments
>  were passed through the stack from an outside point of view.
>
> - The ipv6_helper() hook function can't receive fragments anymore for
>  connections using a helper, so it is switched to use ipv6_skip_exthdr()
>  instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the
>  reassembled packets are passed to connection tracking helpers.
>
> The result of this is that we can properly track fragmented packets, but
> still generate ICMPv6 Packet too big messages if we would have before.
>
> This patch is also required as a precondition for IPv6 NAT, where NAT
> helpers might enlarge packets up to a point that they require
> fragmentation. In that case we can't generate Packet too big messages
> since the proper MTU can't be calculated in all cases (e.g. when
> changing textual representation of a variable amount of addresses),
> so the packet is transparently fragmented iff the original packet or
> fragments would have fit the outgoing MTU.
>
> IPVS parts by Jesper Dangaard Brouer <brouer@...hat.com>.
>
> Signed-off-by: Patrick McHardy <kaber@...sh.net>
> ---
> include/linux/ipv6.h                           |    1 +
> net/ipv6/ip6_output.c                          |    7 +++-
> net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   41 ++++++++++++++++++-----
> net/ipv6/netfilter/nf_conntrack_reasm.c        |   19 +++++++++--
> net/netfilter/ipvs/ip_vs_xmit.c                |    9 +++++-
> 5 files changed, 62 insertions(+), 15 deletions(-)
>
> diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
> index 879db26..0b94e91 100644
> --- a/include/linux/ipv6.h
> +++ b/include/linux/ipv6.h
> @@ -256,6 +256,7 @@ struct inet6_skb_parm {
> #if defined(CONFIG_IPV6_MIP6) || defined(CONFIG_IPV6_MIP6_MODULE)
> 	__u16			dsthao;
> #endif
> +	__u16			frag_max_size;
>
> #define IP6SKB_XFRM_TRANSFORMED	1
> #define IP6SKB_FORWARDED	2
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 5b2d63e..a4f6263 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -493,7 +493,8 @@ int ip6_forward(struct sk_buff *skb)
> 	if (mtu < IPV6_MIN_MTU)
> 		mtu = IPV6_MIN_MTU;
>
> -	if (skb->len > mtu && !skb_is_gso(skb)) {
> +	if ((!skb->local_df && skb->len > mtu && !skb_is_gso(skb)) ||

You use (!skb->local_df) to invalidate the use of skb->len, instead of
(!IP6CB(skb)->frag_max_size) (which is okay, because you set local_df
further down in this patch).  Is there a reason this check is better?

> +	    (IP6CB(skb)->frag_max_size && IP6CB(skb)->frag_max_size > mtu)) {

Eric Dumazet would probably nitpick and say it can be reduced to the
following, since a zero frag_max_size can never be greater than mtu:
  (IP6CB(skb)->frag_max_size > mtu)
;-)
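
A quick userspace sketch of why the shorter form is equivalent, assuming
mtu is an unsigned type (as I believe it is in ip6_forward() and
ip6_fragment()) and frag_max_size is the __u16 added by this patch. The
function names check_long(), check_short() and count_mismatches() are
made up for illustration and are not kernel code:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Condition as written in the patch (second half of the if). */
static int check_long(uint16_t frag_max_size, uint32_t mtu)
{
	return frag_max_size && frag_max_size > mtu;
}

/* Reduced form: 0 > mtu is always false for an unsigned mtu. */
static int check_short(uint16_t frag_max_size, uint32_t mtu)
{
	return frag_max_size > mtu;
}

/* Compare both forms over every 16-bit frag_max_size for a few
 * sample MTU values; returns the number of disagreements.
 */
static unsigned long count_mismatches(void)
{
	static const uint32_t mtus[] = {
		0, 1, 1280, 1500, 9000, 65535, 65536, 0xffffffffu
	};
	unsigned long bad = 0;
	size_t i;
	uint32_t f;

	for (i = 0; i < sizeof(mtus) / sizeof(mtus[0]); i++)
		for (f = 0; f <= 0xffff; f++)
			if (check_long((uint16_t)f, mtus[i]) !=
			    check_short((uint16_t)f, mtus[i]))
				bad++;
	return bad;
}
```

If mtu were a signed type that could go negative, the two forms could
differ, so the reduction relies on that assumption.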


> 		/* Again, force OUTPUT device used as source address */
> 		skb->dev = dst->dev;
> 		icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
> @@ -636,7 +637,9 @@ int ip6_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *))
> 	/* We must not fragment if the socket is set to force MTU discovery
> 	 * or if the skb it not generated by a local socket.
> 	 */
> -	if (unlikely(!skb->local_df && skb->len > mtu)) {
> +	if (unlikely(!skb->local_df && skb->len > mtu) ||
> +		     (IP6CB(skb)->frag_max_size &&
> +		      IP6CB(skb)->frag_max_size > mtu)) {
> 		if (skb->sk && dst_allfrag(skb_dst(skb)))
> 			sk_nocaps_add(skb->sk, NETIF_F_GSO_MASK);
>
[cut]

> --- a/net/ipv6/netfilter/nf_conntrack_reasm.c
> +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
> @@ -190,6 +190,7 @@ static int nf_ct_frag6_queue(struct nf_ct_frag6_queue *fq, struct sk_buff *skb,
> 			     const struct frag_hdr *fhdr, int nhoff)
> {
> 	struct sk_buff *prev, *next;
> +	unsigned int payload_len;
> 	int offset, end;
>
> 	if (fq->q.last_in & INET_FRAG_COMPLETE) {
> @@ -197,8 +198,10 @@ static int nf_ct_frag6_queue(struct nf_ct_frag6_queue *fq, struct sk_buff *skb,
> 		goto err;
> 	}
>
> +	payload_len = ntohs(ipv6_hdr(skb)->payload_len);
> +
> 	offset = ntohs(fhdr->frag_off) & ~0x7;
> -	end = offset + (ntohs(ipv6_hdr(skb)->payload_len) -
> +	end = offset + (payload_len -
> 			((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));
>
> 	if ((unsigned int)end > IPV6_MAXPLEN) {
> @@ -307,6 +310,8 @@ found:
> 	skb->dev = NULL;
> 	fq->q.stamp = skb->tstamp;
> 	fq->q.meat += skb->len;
> +	if (payload_len > fq->q.max_size)
> +		fq->q.max_size = payload_len;
> 	atomic_add(skb->truesize, &nf_init_frags.mem);
>
> 	/* The first fragment.
> @@ -412,10 +417,12 @@ nf_ct_frag6_reasm(struct nf_ct_frag6_queue *fq, struct net_device *dev)
> 	}
> 	atomic_sub(head->truesize, &nf_init_frags.mem);
>
> +	head->local_df = 1;

/me pointing to where local_df is being set.


> 	head->next = NULL;
> 	head->dev = dev;
> 	head->tstamp = fq->q.stamp;
> 	ipv6_hdr(head)->payload_len = htons(payload_len);
> +	IP6CB(head)->frag_max_size = sizeof(struct ipv6hdr) + fq->q.max_size;
>
> 	/* Yes, and fold redundant checksum back. 8) */
> 	if (head->ip_summed == CHECKSUM_COMPLETE)

[cut]
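
To check my own reading of how local_df and frag_max_size interact, here
is a simplified userspace model of the logic in the hunks above. It is
only a sketch: every sketch_* name is hypothetical, the 40 stands in for
sizeof(struct ipv6hdr), and the forward and fragment paths are collapsed
into one decision function, which the real code does not do.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_IPV6_HDR_LEN 40u	/* stands in for sizeof(struct ipv6hdr) */

/* Models the per-queue max_size updated in nf_ct_frag6_queue(). */
struct sketch_queue {
	unsigned int max_size;
};

/* Remember the largest payload_len seen among the fragments. */
static void sketch_queue_fragment(struct sketch_queue *q, uint16_t payload_len)
{
	if (payload_len > q->max_size)
		q->max_size = payload_len;
}

/* Value computed for frag_max_size in nf_ct_frag6_reasm(). */
static unsigned int sketch_frag_max_size(const struct sketch_queue *q)
{
	return SKETCH_IPV6_HDR_LEN + q->max_size;
}

enum sketch_verdict {
	SKETCH_SEND,		/* fits, send as is */
	SKETCH_REFRAGMENT,	/* too long, but may be fragmented */
	SKETCH_PKT_TOOBIG,	/* answer with ICMPv6 Packet Too Big */
};

/* Same condition as the patched if-statements, then a plain length check. */
static enum sketch_verdict sketch_output(unsigned int len, int local_df,
					 unsigned int frag_max_size,
					 unsigned int mtu)
{
	if ((!local_df && len > mtu) ||
	    (frag_max_size && frag_max_size > mtu))
		return SKETCH_PKT_TOOBIG;
	if (len > mtu)
		return SKETCH_REFRAGMENT;
	return SKETCH_SEND;
}
```

In this model a reassembled packet (local_df set) is refragmented when
its largest original fragment would have fit the MTU, and triggers
Packet Too Big otherwise, while a packet that was never reassembled
(frag_max_size == 0, local_df clear) behaves as before the patch.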