Message-Id: <6.2.5.6.2.20111003112108.03a83a28@binnacle.cx>
Date:	Mon, 03 Oct 2011 11:25:31 -0400
From:	starlight@...nacle.cx
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	linux-kernel@...r.kernel.org, netdev <netdev@...r.kernel.org>,
	Willy Tarreau <w@....eu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Stephen Hemminger <stephen.hemminger@...tta.com>
Subject: Re: big picture UDP/IP performance question re 2.6.18
  -> 2.6.32

Ran 'nohalt' and 'idle=poll' tests and they
were a complete bust.  'nohalt' gave slightly
better performance with 2.6.18(rhel) and
slightly worse performance with 2.6.39.27.
'idle=poll' improved 2.6.39.27 by 2.7%,
which is nowhere near enough to justify
running servers at full power--about double
idle power according to the UPS.  'idle=poll'
did cause the idle loop to show up in 'perf'
at the top of the list, consuming 45% of the
CPU.  So the one thing it does is let one
see how much absolute CPU each component
consumes during test runs, rather than
only a relative number.

I've come to the conclusion that Eric is right
and the primary issue is an increase in the
cost of scheduler context switches.  Have
been watching the context-switch rate and it
has held pretty close to 200k/sec under all
scenarios and kernel versions, so the extra
cost has to be a longer code path, more cache
pressure, or both in the scheduler.  Sadly
this makes newer kernels a no-go for us.
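
For anyone wanting to watch the same number, one way to get a
system-wide context-switch rate is to sample the cumulative 'ctxt'
counter in /proc/stat twice and divide by the interval.  This is only
a minimal sketch, not necessarily how the 200k/sec figure above was
obtained; the function names are made up for illustration.

```python
import time


def parse_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before, after, interval_s):
    """Convert two cumulative counter samples into a per-second rate."""
    return (after - before) / interval_s


def sample(interval_s=1.0, path="/proc/stat"):
    """Read the counter twice, interval_s apart, and return the rate."""
    with open(path) as f:
        before = parse_ctxt(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_ctxt(f.read())
    return switches_per_second(before, after, interval_s)
```

Calling sample() on a Linux box returns the rate over one second.
Comparing that rate across kernels while throughput changes is what
points at per-switch cost rather than switch count.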

At 07:47 AM 10/2/2011 -0700, Stephen Hemminger wrote:
>Try disabling PCI DMA remapping. The additional
>overhead of setting up IO mapping can be a
>performance buzz kill. Try CONFIG_DMAR=n

I decided to skip this.  Can't see how it
will make more than a percent or two
difference since the problem is not the
network stack.

At 09:21 AM 10/2/2011 +0200, Eric Dumazet wrote:
>On new kernels, you can check if your udp sockets
>drops frames because of rcvbuffer being full (cat
>/proc/net/udp, check last column 'drops')

Zero, of course.  This info is also
shown lumped in the 'netstat -s'
"packet receive errors" counter.
Have been watching it all along.
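
The per-socket check Eric describes can be scripted.  The sketch
below assumes the /proc/net/udp layout of newer kernels, where the
header line ends in 'drops' and the second field of each row is the
local address as HEXIP:HEXPORT; the function name is hypothetical.

```python
def parse_udp_drops(text):
    """Map local UDP port -> drop count from /proc/net/udp-style text."""
    lines = text.splitlines()
    if not lines or lines[0].split()[-1] != "drops":
        # Older kernels have no 'drops' column; refuse to guess.
        raise ValueError("no 'drops' column in header")
    drops = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 2:
            continue
        # local_address looks like 00000000:1F90 (hex IP, hex port)
        port = int(fields[1].rsplit(":", 1)[1], 16)
        drops[port] = drops.get(port, 0) + int(fields[-1])
    return drops
```

Feeding it the contents of /proc/net/udp gives a port-to-drops
dictionary; any nonzero value means that socket's receive buffer
overflowed at some point.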

>To check if softirq processing hit some limits:
>cat /proc/net/softnet_stat

Zero drops.  Some noise-level squeezing.

Nice stat!  Wish I had known about this
four or five years ago.
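
For reference, /proc/net/softnet_stat is one row of hexadecimal
counters per CPU; the first three columns are packets processed,
packets dropped, and time_squeeze (softirq ran out of budget or
time).  A minimal sketch for decoding those three, with a made-up
function name, and ignoring the later columns since they vary by
kernel version:

```python
def parse_softnet_stat(text):
    """Decode processed/dropped/time_squeeze from softnet_stat text."""
    rows = []
    for line in text.splitlines():
        fields = [int(x, 16) for x in line.split()]
        if len(fields) < 3:
            continue
        rows.append({
            "row": len(rows),          # one row per CPU, in file order
            "processed": fields[0],
            "dropped": fields[1],
            "time_squeeze": fields[2],
        })
    return rows
```

Summing 'dropped' and 'time_squeeze' across rows before and after a
test run gives the drop and squeeze counts referred to above.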

>Please send full "dmesg" output 

Attached.


View attachment "dmesg_c6_263904.txt" of type "text/plain" (65130 bytes)
