lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1335989484.9611.11.camel@joe2Laptop>
Date:	Wed, 02 May 2012 13:11:24 -0700
From:	Joe Perches <joe@...ches.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	David Miller <davem@...emloft.net>,
	Alexander Duyck <alexander.duyck@...il.com>,
	Alexander Duyck <alexander.h.duyck@...el.com>,
	netdev <netdev@...r.kernel.org>,
	Neal Cardwell <ncardwell@...gle.com>,
	Tom Herbert <therbert@...gle.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	Michael Chan <mchan@...adcom.com>,
	Matt Carlson <mcarlson@...adcom.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Ben Hutchings <bhutchings@...arflare.com>,
	Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>,
	Maciej Żenczykowski <maze@...gle.com>
Subject: Re: [PATCH net-next] net: implement tcp coalescing in
 tcp_queue_rcv()

On Wed, 2012-05-02 at 21:58 +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@...gle.com>
> 
> Extend tcp coalescing implementing it from tcp_queue_rcv(), the main
> receiver function when application is not blocked in recvmsg().
> 
> Function tcp_queue_rcv() is moved a bit to allow its call from
> tcp_data_queue()
> 
> This gives good results especially if GRO could not kick, and if skb
> head is a fragment.
> 
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Cc: Alexander Duyck <alexander.h.duyck@...el.com>
> Cc: Neal Cardwell <ncardwell@...gle.com>
> Cc: Tom Herbert <therbert@...gle.com>
> ---
> To be applied after "[PATCH v2 net-next] net: take care of cloned skbs
> in tcp_try_coalesce()"
> 
>  include/net/tcp.h    |    3 ++-
>  net/ipv4/tcp.c       |   10 +++++-----
>  net/ipv4/tcp_input.c |   40 +++++++++++++++++++++-------------------
>  3 files changed, 28 insertions(+), 25 deletions(-)
> 
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 0fb84de..a9d2fb8 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -438,7 +438,8 @@ extern int tcp_disconnect(struct sock *sk, int flags);
>  
>  void tcp_connect_init(struct sock *sk);
>  void tcp_finish_connect(struct sock *sk, struct sk_buff *skb);
> -void tcp_queue_rcv(struct sock *sk, struct sk_buff *skb, int hdrlen);
> +int __must_check tcp_queue_rcv(struct sock *sk, struct sk_buff *skb,
> +			       int hdrlen, bool *fragstolen);
>  
>  /* From syncookies.c */
>  extern __u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS];
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 9670af3..bd5deff 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -980,8 +980,8 @@ static inline int select_size(const struct sock *sk, bool sg)
>  static int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
>  {
>  	struct sk_buff *skb;
> -	struct tcp_skb_cb *cb;
>  	struct tcphdr *th;
> +	bool fragstolen;

It might be useful to comment that this is/can be initialized to
false in tcp_try_coalesce via tcp_recv_queue.

>  	skb = alloc_skb(size + sizeof(*th), sk->sk_allocation);
>  	if (!skb)
> @@ -994,14 +994,14 @@ static int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
>  	if (memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size))
>  		goto err_free;
>  
> -	cb = TCP_SKB_CB(skb);
> -
>  	TCP_SKB_CB(skb)->seq = tcp_sk(sk)->rcv_nxt;
>  	TCP_SKB_CB(skb)->end_seq = TCP_SKB_CB(skb)->seq + size;
>  	TCP_SKB_CB(skb)->ack_seq = tcp_sk(sk)->snd_una - 1;
>  
> -	tcp_queue_rcv(sk, skb, sizeof(*th));
> -
> +	if (tcp_queue_rcv(sk, skb, sizeof(*th), &fragstolen)) {
> +		WARN_ON_ONCE(fragstolen); /* should not happen */

Otherwise this looks like a possibly uninitialized test
of fragstolen.  Maybe there's a path where tail is null
in tcp_recv_queue and it's an uninitialized test anyway.

> +		__kfree_skb(skb);
> +	}
>  	return size;
>  
>  err_free:
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index f891a5e..2233468 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4662,6 +4662,22 @@ end:
>  		skb_set_owner_r(skb, sk);
>  }
>  
> +int tcp_queue_rcv(struct sock *sk, struct sk_buff *skb, int hdrlen,
> +		  bool *fragstolen)
> +{
> +	int eaten;
> +	struct sk_buff *tail = skb_peek_tail(&sk->sk_receive_queue);
> +
> +	__skb_pull(skb, hdrlen);
> +	eaten = (tail &&
> +		 tcp_try_coalesce(sk, tail, skb, fragstolen)) ? 1 : 0;
> +	tcp_sk(sk)->rcv_nxt = TCP_SKB_CB(skb)->end_seq;
> +	if (!eaten) {
> +		__skb_queue_tail(&sk->sk_receive_queue, skb);
> +		skb_set_owner_r(skb, sk);
> +	}
> +	return eaten;
> +}


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ