Date:	Thu, 27 Feb 2014 08:42:16 -0800
From:	Rick Jones <rick.jones2@...com>
To:	Sharat Masetty <sharat04@...il.com>, netdev@...r.kernel.org
Subject: Re: Packet drops observed @ LINUX_MIB_TCPBACKLOGDROP

On 02/26/2014 06:00 PM, Sharat Masetty wrote:
> Hi,
>
> We are trying to achieve category 4 data rates on an ARM device.

Please forgive my ignorance, but what are "category 4 data rates?"

> We see that with an incoming TCP stream(IP packets coming in and
> acks going out) lots of packets are getting dropped when the backlog
> queue is full. This is impacting overall data TCP throughput. I am
> trying to understand the full context of why this queue is getting
> full so often.
>
> From my brief look at the code, it looks to me like the user space
> process is slow and busy in pulling the data from the socket buffer,
> therefore the TCP stack is using this backlog queue in the mean time.
> This queue is also charged against the main socket buffer allocation.
>
> Can you please explain this backlog queue, and possibly confirm whether my
> understanding of this matter is accurate?
> Also can you suggest any ideas on how to mitigate these drops?

Well, there is always the question of why the user process is slow 
pulling the data out of the socket.  If it is unable to handle this 
"category 4 data rate" on a sustained basis, then something has got to 
give.  If it is only *sometimes* unable to keep up but otherwise is able 
to go as fast or faster (and so can clear out a backlog), then you 
could consider tweaking the size of the queue.  But it would be better 
still to find the cause of the occasional slowness and address it.
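One way to confirm the drops and give the queue more headroom while you investigate: the backlog queue is charged against the socket receive buffer, so watching the drop counter and raising the receive-buffer ceiling are the two obvious knobs.  A sketch, assuming a reasonably recent iproute2 nstat and root access; the sysctl values are only illustrative, not a recommendation:

```shell
# LINUX_MIB_TCPBACKLOGDROP is reported as TcpExtTCPBacklogDrop;
# run this while the stream is active to see the drop rate.
nstat -az TcpExtTCPBacklogDrop

# The backlog is limited by the socket receive buffer, so raising
# the TCP receive-buffer maximum (min/default/max, in bytes) gives
# it more room.  Example values only - size to your memory budget.
sysctl -w net.core.rmem_max=4194304
sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
```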

If you run something which does no processing on the data (eg netperf) 
are you able to achieve the data rates you seek?  At what level of CPU 
utilization?  From a system you know can generate the desired data rate, 
something like:

netperf -H <yourARMsystem> -t TCP_STREAM -C -- -m <what your application sends each time>

If the ARM system is multi-core, I might go with

netperf -H <yourARMsystem> -t TCP_STREAM -C -- -m <sendsize> \
    -o throughput,remote_cpu_util,remote_cpu_peak_util,remote_cpu_peak_id,remote_sd

so netperf will tell you the ID and utilization of the most utilized CPU 
on the receiver in addition to the overall CPU utilization.

There might be other netperf options to use depending on just what the 
sender is doing - to know which would require knowing more about this 
stream of traffic.
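For instance, if the traffic is request/response rather than a bulk transfer, a TCP_RR test with sizes matching the application may be a closer fit.  The sizes below are placeholders, same as in the commands above:

```shell
# Request/response test; -r takes request,response sizes in bytes,
# and -C again reports remote (receiver) CPU utilization.
netperf -H <yourARMsystem> -t TCP_RR -C -- -r <requestsize>,<responsesize>
```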

happy benchmarking,

rick jones
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
