Message-ID:  <pan.2009.09.03.19.20.44.736875@googlemail.com>
Date:	Thu, 03 Sep 2009 21:20:44 +0200
From:	"Holger Hoffstaette" <holger.hoffstaette@...glemail.com>
To:	netdev@...r.kernel.org
Subject:  Re: Network hangs with 2.6.30.5


Problem found! At least for me.

On Thu, 03 Sep 2009 07:46:10 +0000, Jarek Poplawski wrote:

> On 01-09-2009 17:32, Holger Hoffstaette wrote:
>> On Tue, 01 Sep 2009 16:17:08 +0200, Holger Hoffstaette wrote:
>> 
>> [network regressions in .30]
>> 
>>> I do have an older Intel Gbit card identified thusly: 00:0b.0 Ethernet
>>> controller: Intel Corporation 82545GM Gigabit Ethernet Controller (rev
>>> 04)
>>>
>>> and enabled all sorts of offloading:
>>>
>>> $ethtool -k eth0
>>> Offload parameters for eth0:
>>> rx-checksumming: on
>>> tx-checksumming: on
>>> scatter-gather: on
>>> tcp segmentation offload: on
>>> udp fragmentation offload: off
>>> generic segmentation offload: on
>>>
>>> Maybe that is the culprit, as Eric Dumazet suspected in his mail..I
>>> will try the latest .30 stable again without that, but in any case
>>> something is indeed very broken in there.
>> 
>> So I just tried .30.5 again. Indeed the offloading seems to play a role:
>> with everything enabled I cannot even reliably ssh into the machine
>> (only "sometimes"?); however without any offloading things get "a bit
>> better" and squid even serves up some pages..for a while. Then it seems
>> to hang, swallow requests or not finish them. The tested sites reliably
>> work for the Windows client when it bypasses squid, as does DNS (also
>> served from the box). It *seems* to affect incoming traffic more than
>> outgoing - e.g. mail or news polling seemed to kick off and finish just
>> fine. Rebooting back into .29 fixes everything. Last time I tried
>> .31rc-something (4 IIRC) it exhibited the same problems.
>> 
>> I'm open to suggestions and willing to help fix this but need this
>> machine for actual work. :/
> 
> It seems, you and Clifford, use e1000 so it would be interesting to find
> out if it matters. Does your friend with working .30 use another card? If
> you can't try with another NIC, we could probably try to revert most of
> the driver's changes after .29 (except maybe 3) to check this driver only.
> 
> Clifford, if it still doesn't work for you, could you try 2.6.29?

I got the git .30.y stable tree and reverted various e1000 commits that
seemed to coincide with the various .30-rc releases, but nothing helped.
I also found no relation to the offload settings.

However, I did notice that the "stuck squid" problem seemed to magically
fix itself after a few seconds, then hang again, fix itself again after
timeouts, and so on. So I suspected something TCP-related and BINGO!

Turns out I had both tcp_tw_recycle and tcp_tw_reuse set to 1 for reasons
I don't want to explain. :)

I can now arbitrarily fix the hanging behaviour by setting
tcp_tw_recycle to 0, and cause hangs by setting it to 1 again. This seems
to affect squid, with its many short-lived connections, more than tasks
with longer-lived connections. What the right behaviour is, I don't know.

tcp_tw_reuse does not appear to play a role, so the real culprit at least
in my case seems to be tcp_tw_recycle. In previous releases this (and
tw_reuse) was necessary for various server tasks.
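
For anyone who wants to check their own box, here is a minimal sketch that
reads the two sysctls discussed above and can flip one of them. It assumes
the usual /proc/sys/net/ipv4 location; the helper names (read_sysctl,
write_sysctl, report) are made up for illustration, writing requires root,
and on newer kernels the tcp_tw_recycle file may simply be absent (it was
removed in Linux 4.12), which the sketch reports as None.

```python
from pathlib import Path

# Usual location of the IPv4 sysctls on Linux (assumption: /proc is mounted).
SYSCTL_DIR = Path("/proc/sys/net/ipv4")


def read_sysctl(name, base=SYSCTL_DIR):
    """Return the integer value of a sysctl, or None if it cannot be read."""
    try:
        return int((Path(base) / name).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        # Missing file, no permission, or unexpected content.
        return None


def write_sysctl(name, value, base=SYSCTL_DIR):
    """Write an integer value to a sysctl (needs root on a real system)."""
    (Path(base) / name).write_text(f"{value}\n")


def report(base=SYSCTL_DIR):
    """Return the current values of the two TIME_WAIT sysctls."""
    names = ("tcp_tw_recycle", "tcp_tw_reuse")
    return {name: read_sysctl(name, base) for name in names}


if __name__ == "__main__":
    for name, value in report().items():
        shown = "not available" if value is None else value
        print(f"{name} = {shown}")
```

Calling write_sysctl("tcp_tw_recycle", 0) as root corresponds to the
setting that stopped the hangs here; the base argument only exists so the
helpers can be pointed at a scratch directory instead of the live /proc.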

Nevertheless, something has changed between .29 and .30 that "broke" the
previous behaviour. Whether this is progress or a regression I cannot
say. Maybe someone else has an idea?

Holger


