Date:	Mon, 7 Sep 2009 07:21:43 +0000
From:	Jarek Poplawski <jarkao2@...il.com>
To:	Holger Hoffstaette <holger.hoffstaette@...glemail.com>
Cc:	netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: Network hangs with 2.6.30.5

On 03-09-2009 21:55, Holger Hoffstaette wrote:
> On Thu, 03 Sep 2009 21:27:08 +0200, Eric Dumazet wrote:
> 
>> Holger Hoffstaette wrote:
>>> Problem found! At least for me..
>>>
>>>> On 01-09-2009 17:32, Holger Hoffstaette wrote:
>>>>> On Tue, 01 Sep 2009 16:17:08 +0200, Holger Hoffstaette wrote:
>>>>>
>>>>> [network regressions in .30]
>>> I got the git .30.y stable tree and reverted various e1000 commits that
>>> seemed to coincide with the various .30-rc releases but nothing helped.
>>> Also no relation to offloads etc.
>>>
>>> However I did notice that the "stuck squid" problem seemed to magically
>>> fix itself after a few seconds - then hang again, fix itself after
>>> timeouts etc. So I suspected something TCP related and BINGO!
>>>
>>> Turns out I had both tcp_tw_recycle and tcp_tw_reuse set to 1 for
>>> reasons I don't want to explain. :)
>>>
>>> I can now arbitrarily fix the hanging behaviour by setting
>>> tcp_tw_recycle to 0, and cause hangs by setting it to 1 again. For
>>> obvious reasons this seems to affect squid more than other tasks with
>>> more long-lived connections. What is the right behaviour? Beats me.
>>>
>>> tcp_tw_reuse does not appear to play a role, so the real culprit at
>>> least in my case seems to be tcp_tw_recycle. In previous releases this
>>> (and tw_reuse) was necessary for various server tasks.
>>>
>>> Nevertheless, something has changed between .29 and .30 that "broke" the
>>> previous behaviour. Whether this is progress or a regression I cannot
>>> say. Maybe someone else has an idea?
>>>
>>>
>> Well... not yet :)
>>
>> We probably can reproduce this problem with any NIC...
>>
>> Could you send from the 'buggy' setup
>>
>> $ grep . /proc/sys/net/ipv4/*
> 
> Sure:
...
> Was that somewhat helpful? I can certainly create a full trace but that's
> going to be big.
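
The toggle described above (hangs with tcp_tw_recycle set to 1, no hangs
with it set to 0) can be sketched as a pair of small shell helpers. This is
only a rough sketch: the function names show_tw and set_tw_recycle and the
SYSCTL_DIR override are made up for illustration and are not from the
thread. Writing to the real /proc/sys files needs root, and a given kernel
may not provide both files.

```shell
# Directory holding the IPv4 sysctls. The override is an assumption added
# here so the helpers can be tried against a scratch directory first.
SYSCTL_DIR="${SYSCTL_DIR:-/proc/sys/net/ipv4}"

# Hypothetical helper: print the current values of the two TIME_WAIT knobs.
show_tw() {
    for name in tcp_tw_recycle tcp_tw_reuse; do
        if [ -r "$SYSCTL_DIR/$name" ]; then
            echo "$name = $(cat "$SYSCTL_DIR/$name")"
        else
            echo "$name = (not available)"
        fi
    done
}

# Hypothetical helper: set tcp_tw_recycle to 0 or 1, rejecting anything else.
set_tw_recycle() {
    case "$1" in
        0|1) echo "$1" > "$SYSCTL_DIR/tcp_tw_recycle" ;;
        *)   echo "usage: set_tw_recycle 0|1" >&2; return 2 ;;
    esac
}
```

Calling show_tw, then set_tw_recycle 0, then show_tw again while watching
whether squid still hangs would repeat the experiment by hand.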

Congratulations on finding the culprit!

While Eric is analyzing your data, I guess you could try reverting
some of the changes around tcp_tw_recycle. With my limited TCP
knowledge, I would start with these commits:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=fc1ad92dfc4e363a055053746552cdb445ba5c57
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.30.y.git;a=commitdiff;h=c887e6d2d9aee56ee7c9f2af4cec3a5efdcc4c72
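
A minimal sketch of that revert step, assuming a local clone of the
linux-2.6.30.y stable tree. The helper name revert_candidates and the
DRY_RUN switch are made up for illustration; the two commit IDs are the
ones from the links above. The reverts are attempted in the order listed,
which may or may not apply cleanly.

```shell
# Commit IDs taken from the two commitdiff links above.
CANDIDATES="fc1ad92dfc4e363a055053746552cdb445ba5c57 c887e6d2d9aee56ee7c9f2af4cec3a5efdcc4c72"

# Hypothetical helper: revert each candidate commit in the given tree.
# With DRY_RUN=1 (the default in this sketch) it only prints the git
# commands; set DRY_RUN=0 to actually run them.
revert_candidates() {
    tree="${1:-.}"
    for commit in $CANDIDATES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "git -C $tree revert --no-edit $commit"
        else
            git -C "$tree" revert --no-edit "$commit" || return 1
        fi
    done
}
```

After the reverts, the kernel would need to be rebuilt and booted, and the
tcp_tw_recycle=1 test repeated to see whether the hangs are gone.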

Regards,
Jarek P.

PS: You don't have to remove anybody from the Cc line on this list. ;-)