Message-ID: <4AC2FA7C.6030901@codefidence.com>
Date: Wed, 30 Sep 2009 08:28:12 +0200
From: Gilad Ben-Yossef <gilad@...efidence.com>
To: Eric Dumazet <eric.dumazet@...il.com>
CC: netdev@...r.kernel.org, Ori Finkalman <ori@...sleep.com>
Subject: Re: [PATCH] [RFC] IPv4 TCP fails to send window scale option when
window scale is zero
Hi,
[ Resending this reply due to the sorry state of the Android Gmail client.
My apologies if you got it twice. ]
Eric Dumazet wrote:
> Gilad Ben-Yossef wrote:
>
>> From: Ori Finkalman <ori@...sleep.com>
>>
>>
>> Acknowledge TCP window scale support by inserting the proper option in
>> SYN/ACK header
>> even if our window scale is zero.
>>
>>
>> This fixes the following observed behavior:
>>
>>
>> 1. Client sends a SYN with TCP window scaling option and non zero window
>> scale value to a Linux box.
>>
>> 2. Linux box notes large receive window from client.
>>
>> 3. Linux decides on a zero value of window scale for its part.
>>
>> 4. Due to a comparison against the requested window scale option, Linux
>> does not send the window scale TCP option header on SYN/ACK at all.
>>
>>
>> Result:
>>
>>
>> Client box thinks TCP window scaling is not supported, since SYN/ACK had
>> no TCP window scale option, while Linux thinks that TCP window scaling is
>> supported (and scale might be non zero), since SYN had a TCP window scale
>> option. The client and server end up with mismatched ideas regarding
>> window sizes.
>>
>>
>> Please comment and/or apply.
>> ...
>>
>>
>> Signed-off-by: Gilad Ben-Yossef <gilad@...efidence.com>
>> Signed-off-by: Ori Finkelman <ori@...sleep.com>
>>
>>
>> Index: net/ipv4/tcp_output.c
>> ===================================================================
>> --- net/ipv4/tcp_output.c (revision 46)
>> +++ net/ipv4/tcp_output.c (revision 210)
>> @@ -353,6 +353,7 @@ static void tcp_init_nondata_skb(struct
>> #define OPTION_SACK_ADVERTISE (1 << 0)
>> #define OPTION_TS (1 << 1)
>> #define OPTION_MD5 (1 << 2)
>> +#define OPTION_WSCALE (1 << 3)
>>
>> struct tcp_out_options {
>> u8 options; /* bit field of OPTION_* */
>> @@ -417,7 +418,7 @@ static void tcp_options_write(__be32 *pt
>> TCPOLEN_SACK_PERM);
>> }
>>
>> - if (unlikely(opts->ws)) {
>> + if (unlikely(OPTION_WSCALE & opts->options)) {
>> *ptr++ = htonl((TCPOPT_NOP << 24) |
>> (TCPOPT_WINDOW << 16) |
>> (TCPOLEN_WINDOW << 8) |
>> @@ -530,8 +531,8 @@ static unsigned tcp_synack_options(struc
>>
>> if (likely(ireq->wscale_ok)) {
>> opts->ws = ireq->rcv_wscale;
>> - if(likely(opts->ws))
>> - size += TCPOLEN_WSCALE_ALIGNED;
>> + opts->options |= OPTION_WSCALE;
>> + size += TCPOLEN_WSCALE_ALIGNED;
>> }
>> if (likely(doing_ts)) {
>> opts->options |= OPTION_TS;
>>
>>
>>
>>
>
> This does not seem like the most logical place to put this logic...
>
> How about this instead ?
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 5200aab..b78c084 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -216,6 +216,11 @@ void tcp_select_initial_window(int __space, __u32 mss,
> space >>= 1;
> (*rcv_wscale)++;
> }
> + /*
> + * Set a minimum wscale of 1
> + */
> + if (*rcv_wscale == 0)
> + *rcv_wscale = 1;
> }
>
> /* Set initial window to value enough for senders,
>
>
Thank you for the patch review. The suggested replacement patch
is certainly shorter, code-wise, which is an advantage.
I can't help but feel, though, that it is less readable: a window scale
of zero is a perfectly legitimate value. Adding special logic to rule it
out just because we chose to overload this field to also signal whether
window scaling is supported seems like an invitation for someone to get
it wrong again down the line, in my opinion.
Also note that the fix in our original patch (an explicit OPTION_WSCALE
flag) is in line with how other TCP options are handled, e.g. the TCP
timestamp option and its OPTION_TS flag.
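To illustrate the difference between the two approaches, here is a minimal
user-space sketch in plain C. It is not kernel code: the struct, the
function names and WSCALE_ALIGNED_LEN are hypothetical stand-ins loosely
modelled on the quoted diff, and only the window scale part of the option
sizing is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the names used in the patch. */
#define OPTION_WSCALE      (1u << 3)
#define WSCALE_ALIGNED_LEN 4u  /* NOP + kind + length + shift count */

struct out_options {
	uint8_t options; /* bit field of OPTION_* flags */
	uint8_t ws;      /* window scale shift count; 0 is a valid value */
};

/*
 * Overloaded variant: ws == 0 doubles as "do not send the option",
 * so a negotiated scale of zero is never acknowledged in the SYN/ACK.
 */
static unsigned synack_wscale_size_overloaded(struct out_options *opts,
					      bool wscale_ok,
					      uint8_t rcv_wscale)
{
	unsigned size = 0;

	assert(opts != NULL);
	if (wscale_ok) {
		opts->ws = rcv_wscale;
		if (opts->ws)
			size += WSCALE_ALIGNED_LEN;
	}
	return size;
}

/*
 * Flagged variant: an explicit bit records that the option must be
 * written, independent of the shift value itself.
 */
static unsigned synack_wscale_size_flagged(struct out_options *opts,
					   bool wscale_ok,
					   uint8_t rcv_wscale)
{
	unsigned size = 0;

	assert(opts != NULL);
	if (wscale_ok) {
		opts->ws = rcv_wscale;
		opts->options |= OPTION_WSCALE;
		size += WSCALE_ALIGNED_LEN;
	}
	return size;
}
```

With wscale_ok set and rcv_wscale equal to zero, the overloaded variant
reserves no space for the option, while the flagged variant reserves the
aligned option length and sets OPTION_WSCALE, so the writer can emit a
window scale option carrying a shift of zero.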
Does anyone else want to chime in on that?
PS. I also managed to get the spelling of the patch author's name wrong.
It is Ori Finkelman and not as written.
Thanks!
Gilad
--
Gilad Ben-Yossef
Chief Coffee Drinker & CTO
Codefidence Ltd.
Web: http://codefidence.com
Cell: +972-52-8260388
Skype: gilad_codefidence
Tel: +972-8-9316883 ext. 201
Fax: +972-8-9316884
Email: gilad@...efidence.com
Check out our Open Source technology and training blog - http://tuxology.net
"Now the world has gone to bed
Darkness won't engulf my head
I can see by infra-red
How I hate the night."