lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <83a51e120712181144l65633b32r72cc369f9d012f47@mail.gmail.com>
Date:	Tue, 18 Dec 2007 14:44:44 -0500
From:	"James Nichols" <jamesnichols3@...il.com>
To:	"Eric Dumazet" <dada1@...mosbay.com>
Cc:	"Jan Engelhardt" <jengelh@...putergmbh.de>,
	linux-kernel@...r.kernel.org
Subject: Re: After many hours all outbound connections get stuck in SYN_SENT

> Well... please dont start a flame war :(
>
> Back to your SYN_SENT problem, I suppose the remote IP is known, so you
> probably could post here the result of a tcdpump ?
>
> tcpdump -p -n -s 1600 host IP_of_problematic_peer -c 500
>
> Most probably remote peer received too many attempts from you, and a
> anti DOS mechanism is droping all SYN packets.
>
> Ah well... I remember now that you mentioned tcp_sack setting had an
> effect, so forget the "Most probably..." and give some tcpdump traces :)


I'm not trying to start a flame war.  My situation is that I'm a
performance engineer and I have to restart the app every 38 hours due
to this issue, I'm not the person(s) who wrote it.  It's my job (and
the kernel's) job to support whatever application is being run.  Also,
I was seriously curious to know if there were any better tools for
debugging this in C that I wasn't aware of.  Anwyay...


I've run tcpdump for all IPs during this problem.  I haven't tried
doing it for a single explicit IP address- due to the nature of the
workload it's very difficult to know which IPs will be hit at any
given moment.   What I did see in the full IP captures is that the
returning ACKs don't show up in the packet capture.  Unfortunately,
tcpdump reported that some packets were dropped during the capture.
Is it possible that the kernel was dropping the packets before they
could be captured by tcpdump?

Also, I have some doubts about it being the end points or an
intermediate router, please let me know if these are unreasonable:
1)  We've completely replaced our routing equipment several times in
the past 4 years... totally different colos, router vendors, firewall
vendors, firewall rules, etc.
2)  It occurs across all remote end points at the exact same time.
The endpoints are hetrogenous, run brain-dead OS's that don't do any
DOS detection, reboot at random times of the day, are geographically
distributed, are on different ISPs, etc. etc.
3)  Turning of tcp_sack instantaneously makes the problem go away.  If
it were endpoints or a router, it seems like a stretch that removing a
single TCP option would make the problem instantly resolve itself in
so many places other than the originating host.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ