Date:	Tue, 26 Aug 2008 18:17:31 -0300
From:	Dâniel Fraga <fragabr@...il.com>
To:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
Cc:	David Miller <davem@...emloft.net>, thomas.jarosch@...ra2net.com,
	billfink@...dspring.com, Netdev <netdev@...r.kernel.org>,
	Patrick Hardy <kaber@...sh.net>,
	netfilter-devel@...r.kernel.org, kadlec@...ckhole.kfki.hu
Subject: Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility
 workaround

On Tue, 26 Aug 2008 23:40:58 +0300 (EEST)
"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi> wrote:

> If you want to, a tcpdump from normal, working case wouldn't hurt either 
> to show the "normal pattern" on network level and that is trivial to 
> produce in no time now that you know the commands etc. I guess... :-)

	Ok, here it is:

http://www.abusar.org/htb/dump-normal.log
	
	It covers only port 995: I checked email, received a message,
then checked again, all with the normal behaviour.

> They might not be that interested until we have something more concrete 
> than what we know currently... :-)

	Ok :) And you're right: if I disable frto and htb *and* the
problem goes away, there's a good chance it is something related to the
kernel, or a mix of kernel and user space problems that only shows up
when frto and/or htb are used.

> Can you explain a bit more? Does it resolve during it or some time after 
> it? And more importantly how do you know that it resolves? Ie., what is 
> the normal behavior (be more specific than "it works" :-), how do you know 
> that it's working).

	Ok. For example:

1) the connection is normal, then suddenly it stalls. I cannot receive
mail, nor download nntp messages, nor access ftp etc.

2) I run "nmap -sS server" on my client machine and...

3) ...immediately the connection is no longer stalled.

	Now I've remembered one thing and I'd like to ask a question (I
hope it isn't a stupid one): dynticks (tickless) was implemented for
x86-64 in the 2.6.24 kernel and I started using dynticks in 2.6.24.
Could it be affecting the server behaviour? I have dynticks enabled on
all my machines, but does it make sense to use it in a server
environment? Could dynticks cause this? So far I don't think so,
but... who knows?

http://kernelnewbies.org/Linux_2_6_24#head-4edc562fa1b9fa8e5da5adaf1beab057237c325d

> It seems that either we lack some traffic between the parties or simply 
> need to find out what the userspace is doing, and in the latter case what 
> happens in the network might not be relevant at all. Is there possibility 
> that we miss an alternative route by using the host rule for tcpdump (at 
> the server)? Nmap starts at 22:26:26.613098, the last packet in the client 
> log is at 22:26:01.452842. Alternatively, the port 995 was not the right 
> one to track (though there's clearly this on network level visible problem 
> with it too)... :-(

	I tracked port 995 because I have problems reading email
over pop3s (995). Should I do it differently with tcpdump?

> You might jump into conclusions too quickly every now and then, more
> time might be needed to really ensure something is working. Obviously
> if any non-workingness is noticed, it's always a counter-proof even if 
> long working periods occur in between.

	Ok. It seems a complex issue. You're right. I need more
patience ;)

> In syscall terms this ListenOverflow means that int listen(int sockfd, int 
> backlog); (see man -S 2 listen) is given some size as backlog for those 
> connections that are not yet accept()'ed, and that is exhausted when the 
> ListenOverflow gets incremented (ie., if I'm not completely wrong :-)).

	Hmm interesting.
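
	For reference, the ListenOverflow counter discussed above is
exported in /proc/net/netstat as ListenOverflows. The following is only
a minimal sketch for reading it, assuming the usual layout of that file
(a "TcpExt:" line of names followed by a "TcpExt:" line of values). The
function name tcpext_counter and its optional file argument are
hypothetical, introduced here for illustration:

```shell
#!/bin/sh
# tcpext_counter NAME [FILE]
# Print the value of one TcpExt counter (default: ListenOverflows).
# FILE defaults to /proc/net/netstat; the argument is a hypothetical
# convenience so a saved snapshot can be parsed as well.
tcpext_counter() {
    awk -v name="${1:-ListenOverflows}" '
        # First TcpExt line holds the counter names: remember the column.
        $1 == "TcpExt:" && !seen {
            for (i = 2; i <= NF; i++) if ($i == name) col = i
            seen = 1
            next
        }
        # Second TcpExt line holds the values: print the matching column.
        $1 == "TcpExt:" && seen {
            if (col) print $col
            exit
        }
    ' "${2:-/proc/net/netstat}"
}
```

	Running "tcpext_counter ListenOverflows" before and after a
stall would show whether the counter moved during the event.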

> You might want to look on dovecot how to make it accept more concurrent 
> connections, perhaps the login_max_processes_count might be the right one
> (I quickly glanced http://wiki.dovecot.org/LoginProcess) though this is 
> somewhat site configuration dependent according to that page.

	Yes, I have login_max_processes_count = 128 (the default) and
only a few users (just 10), so I don't think that is the problem.
 
> You could try setting up some script which does something along these 
> lines and then redirect its output during the event to some file (+ tcpdumping 
> the thing obviously):
> 
> while [ : ]; do
> 	date "+%s.%N"
> 	cat /proc/net/{netstat,snmp}
> 	sleep 1
> done

	Ok. You're helping a lot. Thanks Ilpo ;)
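
	One possible way to run the loop quoted above with its output
going to a file is sketched below. It keeps the same date and cat
commands; the function name log_counters, its arguments and the log
path in the usage note are assumptions made for this example, not
something taken from the thread:

```shell
#!/bin/sh
# log_counters OUTFILE COUNT FILE...
# Append COUNT timestamped snapshots of the given files to OUTFILE,
# one second apart. Name and arguments are hypothetical.
log_counters() {
    out=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        # Timestamp first (date format as in the quoted loop), then the counters.
        { date "+%s.%N"; cat "$@"; } >> "$out"
        i=$((i + 1))
        # No need to wait after the final sample.
        if [ "$i" -lt "$count" ]; then
            sleep 1
        fi
    done
}
```

	For example, "log_counters /tmp/stall.log 600 /proc/net/netstat
/proc/net/snmp" would collect about ten minutes of samples while
tcpdump runs in another terminal.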


