Message-Id: <20150531.215010.362645685156777606.davem@davemloft.net>
Date: Sun, 31 May 2015 21:50:10 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: daniel@...earbox.net
Cc: ncardwell@...gle.com, netdev@...r.kernel.org, fw@...len.de,
glenn.judd@...ganstanley.com, stephen@...workplumber.org,
edumazet@...gle.com, ycheng@...gle.com
Subject: Re: [PATCH net] tcp: fix child sockets to use system default
congestion control if not set
From: Daniel Borkmann <daniel@...earbox.net>
Date: Fri, 29 May 2015 20:24:25 +0200
> On 05/29/2015 07:47 PM, Neal Cardwell wrote:
>> Linux 3.17 and earlier are explicitly engineered so that if the app
>> doesn't specifically request a CC module on a listener before the SYN
>> arrives, then the child gets the system default CC when the connection
>> is established. See tcp_init_congestion_control() in 3.17 or earlier,
>> which says "if no choice made yet assign the current value set as
>> default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
>> created") altered these semantics, so that children got their parent
>> listener's congestion control even if the system default had changed
>> after the listener was created.
>>
>> This commit returns to those original semantics from 3.17 and earlier,
>> since they are the original semantics from 2007 in 4d4d3d1e8 ("[TCP]:
>> Congestion control initialization."), and some Linux congestion
>> control workflows depend on that.
>>
>> In summary, if a listener socket specifically sets TCP_CONGESTION to
>> "x", or the route locks the CC module to "x", then the child gets
>> "x". Otherwise the child gets the current system default from
>> net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
>> earlier, and this commit restores that.
>>
>> Fixes: 55d8694fa82c ("net: tcp: assign tcp cong_ops when tcp sk is
>> created")
>> Cc: Florian Westphal <fw@...len.de>
>> Cc: Daniel Borkmann <dborkman@...hat.com>
>> Cc: Glenn Judd <glenn.judd@...ganstanley.com>
>> Cc: Stephen Hemminger <stephen@...workplumber.org>
>> Signed-off-by: Neal Cardwell <ncardwell@...gle.com>
>> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>> Signed-off-by: Yuchung Cheng <ycheng@...gle.com>
>
> Ok, change looks good to me, thanks.
>
> Acked-by: Daniel Borkmann <daniel@...earbox.net>
Applied and queued up for -stable, thanks!
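
For readers of the archive, the selection rule summarized in the quoted
commit message can be modeled roughly as below. This is only an
illustrative sketch in Python, not the kernel code: the function name
and parameters are hypothetical, and the precedence of a route-locked
module over a listener's own setting when both are present is an
assumption, since the thread does not say which one wins.

```python
from typing import Optional


def child_congestion_control(
    listener_cc: Optional[str],
    route_locked_cc: Optional[str],
    system_default: str,
) -> str:
    """Model which CC module a newly established child socket gets.

    listener_cc     -- module the app set with TCP_CONGESTION on the
                       listener, or None if it never made a choice
    route_locked_cc -- module locked by the route, or None
    system_default  -- value of net.ipv4.tcp_congestion_control at the
                       time the connection is established
    """
    # Assumption (not stated in the thread): a route lock is checked first.
    if route_locked_cc is not None:
        return route_locked_cc
    # An explicit choice on the listener is inherited by the child.
    if listener_cc is not None:
        return listener_cc
    # No choice made: use the current system default, not whatever the
    # default happened to be when the listener was created.
    return system_default


if __name__ == "__main__":
    # Listener made no choice; the sysctl was changed after listen().
    print(child_congestion_control(None, None, "cubic"))
    # Listener explicitly asked for a module before the SYN arrived.
    print(child_congestion_control("reno", None, "cubic"))
```

The point of the third branch is that the default is looked up when the
child is established, so a later sysctl change takes effect for new
connections on an already-listening socket.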