lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 30 Sep 2009 15:06:27 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
CC:	Gilad Ben-Yossef <gilad@...efidence.com>,
	Netdev <netdev@...r.kernel.org>, Ori Finkalman <ori@...sleep.com>
Subject: Re: [PATCH] [RFC] IPv4 TCP fails to send window scale option when
 window scale is zero

Ilpo Järvinen a écrit :
> On Wed, 30 Sep 2009, Eric Dumazet wrote:
> 
>> Gilad Ben-Yossef a écrit :
>>> Eric Dumazet wrote:
>>>
>>>> Gilad Ben-Yossef a écrit :
>>>>  
>>>>> From: Ori Finkalman <ori@...sleep.com>
>>>>>
>>>>>
>>>>> Acknowledge TCP window scale support by inserting the proper option in
>>>>> SYN/ACK header
>>>>> even if our window scale is zero.
>>>>>
>>>>>
>>>>> This fixes the following observed behavior:
>>>>>
>>>>>
>>>>> 1. Client sends a SYN with TCP window scaling option and non zero window
>>>>> scale value to a Linux box.
>>>>>
>>>>> 2. Linux box notes large receive window from client.
>>>>>
>>>>> 3. Linux decides on a zero value of window scale for its part.
>>>>>
>>>>> 4. Due to compare against requested window scale size option, Linux does
>>>>> not to send windows scale
>>>>>
>>>>> TCP option header on SYN/ACK at all.
>>>>>
>>>>>
>>>>> Result:
>>>>>
>>>>>
>>>>> Client box thinks TCP window scaling is not supported, since SYN/ACK had
>>>>> no TCP window scale option,
>>>>> while Linux thinks that TCP window scaling is supported (and scale might
>>>>> be non zero), since SYN had
>>>>>
>>>>> TCP window scale option and we have a mismatched idea between the client
>>>>> and server regarding window sizes.
>>>>>
>>>>>
>>>>> Please comment and/or apply.
>>>>> ...
>>>>>
>>>>>
>>>>> Signed-off-by: Gilad Ben-Yossef <gilad@...efidence.com>
>>>>> Signed-off-by: Ori Finkelman <ori@...sleep.com>
>>>>>
>>>>>
>>>>> Index: net/ipv4/tcp_output.c
>>>>> ===================================================================
>>>>> --- net/ipv4/tcp_output.c    (revision 46)
>>>>> +++ net/ipv4/tcp_output.c    (revision 210)
>>>>> @@ -353,6 +353,7 @@ static void tcp_init_nondata_skb(struct
>>>>> #define OPTION_SACK_ADVERTISE    (1 << 0)
>>>>> #define OPTION_TS        (1 << 1)
>>>>> #define OPTION_MD5        (1 << 2)
>>>>> +#define OPTION_WSCALE        (1 << 3)
>>>>>
>>>>> struct tcp_out_options {
>>>>>     u8 options;        /* bit field of OPTION_* */
>>>>> @@ -417,7 +418,7 @@ static void tcp_options_write(__be32 *pt
>>>>>                    TCPOLEN_SACK_PERM);
>>>>>     }
>>>>>
>>>>> -    if (unlikely(opts->ws)) {
>>>>> +    if (unlikely(OPTION_WSCALE & opts->options)) {
>>>>>         *ptr++ = htonl((TCPOPT_NOP << 24) |
>>>>>                    (TCPOPT_WINDOW << 16) |
>>>>>                    (TCPOLEN_WINDOW << 8) |
>>>>> @@ -530,8 +531,8 @@ static unsigned tcp_synack_options(struc
>>>>>
>>>>>     if (likely(ireq->wscale_ok)) {
>>>>>         opts->ws = ireq->rcv_wscale;
>>>>> -        if(likely(opts->ws))
>>>>> -            size += TCPOLEN_WSCALE_ALIGNED;
>>>>> +        opts->options |= OPTION_WSCALE;
>>>>> +        size += TCPOLEN_WSCALE_ALIGNED;
>>>>>     }
>>>>>     if (likely(doing_ts)) {
>>>>>         opts->options |= OPTION_TS;
>>>>>
>>>>>
>>>>>
>>>>>     
>>>> Seems not the more logical places to put this logic...
>>>>
>>>> How about this instead ?
>>>>
>>>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>>>> index 5200aab..b78c084 100644
>>>> --- a/net/ipv4/tcp_output.c
>>>> +++ b/net/ipv4/tcp_output.c
>>>> @@ -216,6 +216,11 @@ void tcp_select_initial_window(int __space, __u32
>>>> mss,
>>>>              space >>= 1;
>>>>              (*rcv_wscale)++;
>>>>          }
>>>> +        /*
>>>> +         * Set a minimum wscale of 1
>>>> +         */
>>>> +        if (*rcv_wscale == 0)
>>>> +            *rcv_wscale = 1;
>>>>         }
>>>>
>>>>         /* Set initial window to value enough for senders,
>>>>
>>>>   
>>> Thank you for the patch review. The suggested replacement patch
>>> certainly is shorter, code wise, which is an advantage.
>>>
>>> I cant help but feel though, that it is less readable - a window scale
>>> of zero is a perfectly legit value. Adding special logic to rule it out
>>> just because we chose to overload this setting for something else
>>> (whether window scaling is supported or not) seems like an invitation
>>> for someone to get it wrong again down the line, in my opinion.
>> As a matter of fact I didnot test your patch.
>>
>> My reaction was driven by :
>>
>> Your version slows down the tcp_options_write() function, once per tx packet.
> 
> Are you serious that anding would cost that much? :-/

Not really :)

> 
>> tcp_options_write() should not change socket state,
> 
> I fail to see how his patch was changing socket state in anyway in 
> anywhere?

Me too, now you say it :)

> 
>> while
>> tcp_select_initial_window() is the exact place where we are supposed to
>> compute wscale. 
> 
> And it calculated yielding to result of 0, which is perfectly valid. The 
> problem is that tcp_write_options thinks that 0 is indication of no window 
> scaling, instead of the correct interpretation of zero window scaling 
> which makes the huge difference for the opposite direction traffic as 
> these guys have noted. Not that I find your approach that bad either as 
> we only lose 1-bit accuracy for the window which is rather insignificant 
> as 1-byte window increments do not really make that much sense anyway 
> (and we have to specifically code against doing them anyway so the 
> effective granularity is much higher).

Yes, wscale 0 is RFC valid, but are we sure some equipment wont play funny games
with such value ? At least sending "wscale 1-14" must be working...


My quick&dirty patch was only for discussion, I have no strong opinion on it,
only that was on one place to patch instead of two/three/four I dont know yet.

So please Gilad & Ori send us a new patch :)


> 
>> Also how is managed tcp_syn_options() case (for outgoing connections ?)
>>
>>         if (likely(sysctl_tcp_window_scaling)) {
>>                 opts->ws = tp->rx_opt.rcv_wscale;
>>                 if (likely(opts->ws))
>>                         size += TCPOLEN_WSCALE_ALIGNED;
>>         }
>>
>> Dont you need to patch it as well ?
> 
> One certainly should change that too if that patch is the way to go 
> forward.
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ