Message-ID: <20080826181731.4581fd2c@tux>
Date: Tue, 26 Aug 2008 18:17:31 -0300
From: Dâniel Fraga <fragabr@...il.com>
To: "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
Cc: David Miller <davem@...emloft.net>, thomas.jarosch@...ra2net.com,
billfink@...dspring.com, Netdev <netdev@...r.kernel.org>,
Patrick Hardy <kaber@...sh.net>,
netfilter-devel@...r.kernel.org, kadlec@...ckhole.kfki.hu
Subject: Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround
On Tue, 26 Aug 2008 23:40:58 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi> wrote:
> If you want to, a tcpdump from normal, working case wouldn't hurt either
> to show the "normal pattern" on network level and that is trivial to
> produce in no time now that you know the commands etc. I guess... :-)
Ok, here it is:
http://www.abusar.org/htb/dump-normal.log
The capture covers only port 995. I checked email, received a message,
then checked again: just the normal behaviour.
> They might not be that interested until we have something more concrete
> than what we know currently... :-)
Ok :) And you're right: if I disable frto and htb *and*
the problem goes away, there's a good chance it is something related to
the kernel. Or a mix of kernel and user space problems that only shows up
when frto and/or htb are in use.
> Can you explain a bit more? Does it resolve during it or some time after
> it? And more importantly, how do you know that it resolves? I.e., what is
> the normal behavior (be more specific than "it works" :-), how do you know
> that it's working)?
Ok. For example:
1) the connection is normal, then it suddenly stalls. I cannot receive
mail, download nntp messages, access ftp, etc.
2) On my client machine I run "nmap -sS server" and...
3) ...immediately the connection is no longer stalled.
Now I remembered one thing and I'd like to ask a question (I
hope it isn't a stupid one): dynticks (tickless) was implemented
for x86-64 in the 2.6.24 kernel, and that is when I started using it. Could
it be affecting the server's behaviour? I have dynticks enabled on all
my machines, but does it make sense in a server environment?
Could dynticks cause this? So far I don't think so, but... who
knows?
http://kernelnewbies.org/Linux_2_6_24#head-4edc562fa1b9fa8e5da5adaf1beab057237c325d
> It seems that either we lack some traffic between the parties or simply
> need to find out what the userspace is doing, and in the latter case what
> happens in the network might not be relevant at all. Is there possibility
> that we miss an alternative route by using the host rule for tcpdump (at
> the server)? Nmap starts at 22:26:26.613098, the last packet in the client
> log is at 22:26:01.452842. Alternatively, the port 995 was not the right
> one to track (though there's clearly this on network level visible problem
> with it too)... :-(
I tracked port 995 because I have problems reading email
over pop3s (995). Should I run tcpdump differently?
> You might jump into conclusions too quickly every now and then, more
> time might be needed to really ensure something is working. Obviously
> if any non-workingness is noticed, it's always a counter-proof even if
> long working periods occur in between.
Ok. It seems to be a complex issue. You're right, I need more
patience ;)
> In syscall terms this ListenOverflow means that int listen(int sockfd, int
> backlog); (see man -S 2 listen) is given some size as backlog for those
> connections that are not yet accept()'ed, and that is exhausted when the
> ListenOverflow gets incremented (ie., if I'm not completely wrong :-)).
Hmm interesting.
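To keep an eye on that counter, here is a minimal sketch of how it could be read. As far as I know it shows up as ListenOverflows (with a trailing "s") in the TcpExt group of /proc/net/netstat, where each group is a line of names followed by a line of values. The function name tcpext_counter is just my own placeholder.

```shell
# Print the value of one TcpExt counter from a netstat-format file.
# $1 = counter name (e.g. ListenOverflows)
# $2 = file to read (defaults to /proc/net/netstat)
# tcpext_counter is a hypothetical helper name, not an existing tool.
tcpext_counter() {
    awk -v name="$1" '
        # First TcpExt line: remember the column of every counter name.
        $1 == "TcpExt:" && !seen {
            for (i = 2; i <= NF; i++) col[$i] = i
            seen = 1
            next
        }
        # Second TcpExt line: print the value in the matching column.
        $1 == "TcpExt:" && seen {
            if (name in col) print $(col[name])
            exit
        }
    ' "${2:-/proc/net/netstat}"
}

# Usage on the server:
#   tcpext_counter ListenOverflows
```

If the number grows between two runs while the connection is stalled, that would point at the listen backlog being exhausted; if it stays the same, the stall probably has another cause.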
> You might want to look on dovecot how to make it accept more concurrent
> connections, perhaps the login_max_processes_count might be the right one
> (I quickly glanced http://wiki.dovecot.org/LoginProcess) though this is
> somewhat site configuration dependent according to that page.
Yes, I have login_max_processes_count = 128 (the default) and I
have only about 10 users, so I don't think that's the problem.
> You could try setting up some script which does something along these
> lines and then redirect its output during the event to some file (+ tcpdumping
> the thing obviously):
>
> while [ : ]; do
>     date "+%s.%N"
>     cat /proc/net/{netstat,snmp}
>     sleep 1
> done
Ok. You're helping a lot. Thanks Ilpo ;)
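For the record, this is roughly how I would run that loop with its output appended to a file. It is only a sketch: the function name snapshot and the log path are my own placeholders, and it assumes GNU date for the %N (nanoseconds) format.

```shell
# Append one timestamped snapshot of the given counter files to a log.
# $1 = log file, remaining arguments = files to dump.
# snapshot is a hypothetical helper name.
snapshot() {
    log="$1"
    shift
    {
        date "+%s.%N"   # timestamp, to line up with the tcpdump capture
        cat "$@"        # raw counters, e.g. /proc/net/netstat and /proc/net/snmp
    } >> "$log"
}

# Usage during a stall (the log path is only an example):
#   while :; do
#       snapshot /tmp/stall-counters.log /proc/net/netstat /proc/net/snmp
#       sleep 1
#   done
```

Appending with >> keeps every snapshot in one file, so two consecutive snapshots can be diffed afterwards to see which counters moved while the connection was stalled.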