netdev - RE: [PATCH 0/2] Tracepoint for tcp retransmission

Message-ID: <65795E11DBF1E645A09CEC7EAEE94B9CB5AFAF34@USINDEVS02.corp.hds.com>
Date:	Tue, 20 Dec 2011 13:13:01 -0500
From:	Satoru Moriya <satoru.moriya@....com>
To:	Stephen Hemminger <stephen.hemminger@...tta.com>
CC:	"nhorman@...driver.com" <nhorman@...driver.com>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"tgraf@...radead.org" <tgraf@...radead.org>,
	Seiji Aguchi <seiji.aguchi@....com>,
	"dle-develop@...ts.sourceforge.net" 
	<dle-develop@...ts.sourceforge.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH 0/2] Tracepoint for tcp retransmission

On 12/16/2011 07:17 PM, Stephen Hemminger wrote:
> 
>> Sometimes network packets are dropped for some reason. In enterprise 
>> systems which require strict RAS functionality, we must know the 
>> reason why it happened and explain it to our customers even if using 
>> TCP. When we investigate the incidents, at first we try to find out 
>> whether the problem is in the server(kernel, application) or else 
>> (router, hub etc). And next we try to find out which layer
>> (application/middleware/kernel(IP/TCP/UDP/..)etc.) the problem 
>> occurs.
> 
> I feel sorry for you, your users don't understand TCP. TCP 
> intentionally induces loss to measure capacity. This is one of the 
> fundamental principles of loss based congestion control.

Perhaps my explanation was not sufficient.

Yes, as you said above, TCP induces loss by design, and
customers don't consider that a problem. At the same time,
though, packet drops occur everywhere in the network due to
bugs, misconfiguration, broken hardware, and so on, and they
sometimes cause serious problems for customers' systems.

We provide a Linux support service as part of our business,
and when a serious problem occurs, we collect and analyze logs
to find where the problem is.

With this tracepoint, we can tell whether the problem is in
the OS or somewhere else. (A negative return value means the
packet drop happened inside the OS.)
That is a great help for us in narrowing down the root cause
of the problem.

I'll rewrite the cover letter when I post v2.

Regards,
Satoru