netdev - RE: [PATCH net-next v2 2/5] xen-netback: Add support for IPv6 checksum offload from guest

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 8 Oct 2013 13:50:27 +0000
From:	Paul Durrant <Paul.Durrant@...rix.com>
To:	Wei Liu <wei.liu2@...rix.com>
CC:	"xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	Wei Liu <wei.liu2@...rix.com>,
	David Vrabel <david.vrabel@...rix.com>,
	Ian Campbell <Ian.Campbell@...rix.com>
Subject: RE: [PATCH net-next v2 2/5] xen-netback: Add support for IPv6
 checksum offload from guest

> -----Original Message-----
> From: Wei Liu [mailto:wei.liu2@...rix.com]
> Sent: 08 October 2013 14:31
> To: Paul Durrant
> Cc: xen-devel@...ts.xen.org; netdev@...r.kernel.org; Wei Liu; David Vrabel;
> Ian Campbell
> Subject: Re: [PATCH net-next v2 2/5] xen-netback: Add support for IPv6
> checksum offload from guest
> 
> On Tue, Oct 08, 2013 at 11:58:13AM +0100, Paul Durrant wrote:
> [...]
> > -/*
> > - * This is the amount of packet we copy rather than map, so that the
> > - * guest can't fiddle with the contents of the headers while we do
> > - * packet processing on them (netfilter, routing, etc).
> > +/* This is a miniumum size for the linear area to avoid lots of
> > + * calls to __pskb_pull_tail() as we set up checksum offsets.
> >   */
> > -#define PKT_PROT_LEN    (ETH_HLEN + \
> > -			 VLAN_HLEN + \
> > -			 sizeof(struct iphdr) + MAX_IPOPTLEN + \
> > -			 sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
> > +#define PKT_PROT_LEN 128
> >
> 
> Where does 128 come from?
> 

It's just an arbitrary power of 2 that was chosen because it seems to cover most likely v6 headers and all v4 headers.

> >  static u16 frag_get_pending_idx(skb_frag_t *frag)
> >  {
> > @@ -1118,61 +1113,72 @@ static int xenvif_set_skb_gso(struct xenvif *vif,
> >  	return 0;
> >  }
> >
> > -static int checksum_setup(struct xenvif *vif, struct sk_buff *skb)
> > +static inline void maybe_pull_tail(struct sk_buff *skb, unsigned int len)
> >  {
> > -	struct iphdr *iph;
> > +	if (skb_is_nonlinear(skb) && skb_headlen(skb) < len) {
> > +		int target = min_t(int, skb->len, len);
> > +		__pskb_pull_tail(skb, target - skb_headlen(skb));
> > +	}
> > +}
> > +
> > +static int checksum_setup_ip(struct xenvif *vif, struct sk_buff *skb,
> > +			     int recalculate_partial_csum)
> > +{
> [...]
> >
> >  		if (recalculate_partial_csum) {
> >  			struct tcphdr *tcph = tcp_hdr(skb);
> > +
> > +			header_size = skb->network_header +
> > +				off +
> > +				sizeof(struct tcphdr) +
> > +				MAX_TCP_OPTION_SPACE;
> > +			maybe_pull_tail(skb, header_size);
> > +
> 
> I presume this function is checksum_setup stripped down to handle IPv4
> packet. What's the purpose of changing its behaviour? Why the pull_tail
> here?
> 

We have to make sure that the TCP header is in the linear area as we are about to write to the checksum field. In practice, the 128 byte pull should guarantee this but in case that is varied later I wanted to make sure things did not start to fail in an add way.

> > +static int checksum_setup_ipv6(struct xenvif *vif, struct sk_buff *skb,
> > +			       int recalculate_partial_csum)
> > +{
> > +	int err = -EPROTO;
> > +	struct ipv6hdr *ipv6h = (void *)skb->data;
> > +	u8 nexthdr;
> > +	unsigned int header_size;
> > +	unsigned int off;
> > +	bool done;
> > +
> > +	done = false;
> > +	off = sizeof(struct ipv6hdr);
> > +
> > +	header_size = skb->network_header + off;
> > +	maybe_pull_tail(skb, header_size);
> > +
> > +	nexthdr = ipv6h->nexthdr;
> > +
> > +	while ((off <= sizeof(struct ipv6hdr) + ntohs(ipv6h->payload_len))
> &&
> > +	       !done) {
> > +		/* We only access at most the first 2 bytes of any option
> header
> > +		 * to determine the next one.
> > +		 */
> > +		header_size = skb->network_header + off + 2;
> > +		maybe_pull_tail(skb, header_size);
> > +
> 
> Will this cause performance problem? Is it possible that you pull too
> many times?
> 

I guess it means we may get two pulls for the TCP/UDP headers rather than one so could push the pulls into the individual cases if you think it will affect performance that badly.

> > +		switch (nexthdr) {
> > +		case IPPROTO_FRAGMENT: {
> > +			struct frag_hdr *hp = (void *)(skb->data + off);
> > +
> > +			nexthdr = hp->nexthdr;
> > +			off += 8;
> > +			break;
> > +		}
> > +		case IPPROTO_DSTOPTS:
> > +		case IPPROTO_HOPOPTS:
> > +		case IPPROTO_ROUTING: {
> > +			struct ipv6_opt_hdr *hp = (void *)(skb->data + off);
> > +
> > +			nexthdr = hp->nexthdr;
> > +			off += ipv6_optlen(hp);
> > +			break;
> > +		}
> > +		case IPPROTO_AH: {
> > +			struct ip_auth_hdr *hp = (void *)(skb->data + off);
> > +
> > +			nexthdr = hp->nexthdr;
> > +			off += (hp->hdrlen+2)<<2;
> > +			break;
> > +		}
> > +		default:
> > +			done = true;
> > +			break;
> 
> Can you make the logic explicit here?
> 
>    case IPPROTO_TCP:
>    case IPPROTO_UDP:
>         done = true;
> 	break;
>    default:
>         break;
> 
> Another minor suggestion is that change "done" to "found" because you're
> trying to find the two type of headers.
> 

Yes, I could code it that way if you prefer.

> > +	switch (nexthdr) {
> > +	case IPPROTO_TCP:
> > +		if (!skb_partial_csum_set(skb, off,
> > +					  offsetof(struct tcphdr, check)))
> > +			goto out;
> > +
> > +		if (recalculate_partial_csum) {
> > +			struct tcphdr *tcph = tcp_hdr(skb);
> > +
> > +			header_size = skb->network_header +
> > +				off +
> > +				sizeof(struct tcphdr) +
> > +				MAX_TCP_OPTION_SPACE;
> > +			maybe_pull_tail(skb, header_size);
> > +
> 
> Same question as IPv4 counterpart, why do you need to pull here?
> 

Same answer as before ;-)

  Paul
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html