Message-ID: <49834F2F.9070500@cosmosbay.com>
Date:	Fri, 30 Jan 2009 20:04:15 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Kenny Chang <kchang@...enacr.com>
CC:	netdev@...r.kernel.org
Subject: Re: Multicast packet loss

Kenny Chang wrote:
> Hi all,
> 
> We've been having some issues with multicast packet loss, and we were
> wondering if anyone knows anything about the behavior we're seeing.
> 
> Background: we use multicast messaging with lots of messages per second
> for our work. We recently transitioned many of our systems from an
> Ubuntu Dapper Drake ia32 distribution to Ubuntu Hardy Heron x86_64.
> Since the transition, we've noticed much more multicast packet loss,
> and we think it's related to the transition. Our particular theory is
> that it's specifically a 32- vs 64-bit issue.
> 
> We narrowed the problem down to the attached program (mcasttest.cc).
> Run "mcasttest server" on one machine -- it'll send 500,000 small
> messages to a multicast group, at 50,000 messages per second. If we
> run "mcasttest client" on another machine, it'll receive all those
> messages and print a count at the end of how many messages it sees.
> It almost never loses any messages. However, if we run 4 copies of
> the client on the same machine, receiving the same data, then the
> programs usually see fewer than 500,000 messages. We're running with:
> 
> for i in $(seq 1 4); do (./mcasttest client &); done
> 
> We know this because the program prints a count, but dropped packets also
> show up in ifconfig's "RX packets" section.
> 
> Things we're curious about: do other people see similar problems? As
> for the tests we've done: we've tried this program on a bunch of
> different machines, all of which are running either Dapper ia32 or
> Hardy x86_64. Uniformly, the Dapper machines have no problems, but on
> certain machines Hardy shows significant loss. We did some experiments
> on a troubled machine, varying the OS install, including mixed
> installations where the kernel was 64-bit and the userspace was
> 32-bit. This is what we found:
> 
> On machines that exhibit this problem, the ksoftirqd process seems to
> be pegged at 100% CPU when receiving packets.
> 
> Note: while we're on Ubuntu, we've tried this with other distros and
> have seen similar results; we just haven't tabulated them.
> 
>> ----------------------------------------------------------------------------
>> userland | userland arch | kernel           | kernel arch | result
>> ----------------------------------------------------------------------------
>> Dapper   |            32 | 2.6.15-28-server |          32 | no packet loss
>> Dapper   |            32 | 2.6.22-generic   |          32 | no packet loss
>> Dapper   |            32 | 2.6.22-server    |          32 | no packet loss
>> Hardy    |            32 | 2.6.24-rt        |          32 | no packet loss
>> Hardy    |            32 | 2.6.24-generic   |          32 | ~5% packet loss
>> Hardy    |            32 | 2.6.24-server    |          32 | ~10% packet loss
> 
>> Hardy    |            32 | 2.6.22-server    |          64 | no packet loss
>> Hardy    |            32 | 2.6.24-rt        |          64 | no packet loss
>> Hardy    |            32 | 2.6.24-generic   |          64 | 14% packet loss
>> Hardy    |            32 | 2.6.24-server    |          64 | 12% packet loss
> 
>> Hardy    |            64 | 2.6.22-vanilla   |          64 | packet loss
>> Hardy    |            64 | 2.6.24-rt        |          64 | ~5% packet loss
>> Hardy    |            64 | 2.6.24-server    |          64 | ~30% packet loss
>> Hardy    |            64 | 2.6.24-generic   |          64 | ~5% packet loss
>> ----------------------------------------------------------------------------
> 
> It's not exactly clear what the problem is, but Dapper shows no issues
> regardless of what we try. For Hardy, the userspace arch seems to
> matter: the 2.6.24-rt kernel shows no packet loss on both 32- and
> 64-bit kernels, as long as the userspace is 32-bit.
> 
> Kernel comments:
> 2.6.15-28-server: Ubuntu Dapper's stock kernel build.
> 2.6.24-*: Ubuntu Hardy's stock kernels.
> 2.6.22-{generic,server}: a custom, in-house kernel build, built for ia32.
> 2.6.22-vanilla: our custom, in-house kernel build, built for x86_64.
> 
> We don't think it's related to our custom kernels, because the same
> phenomena show up with the Ubuntu stock kernels.
> 
> Hardware:
> 
> The benchmark machine we've been using is an Intel Xeon E5440 @ 2.83GHz
> dual-CPU quad-core with Broadcom NetXtreme II BCM5708 (bnx2) networking.
> 
> We've also tried AMD machines, as well as machines with Tigon3
> (part no. BCM95704A6) tg3 network cards; they all show consistent
> behavior.
> 
> Our Hardy x86_64 server machines all appear to have this problem, new
> and old.
> 
> On the other hand, a desktop with an Intel Q6600 quad-core 2.4GHz and
> an Intel 82566DC GigE NIC seems to work fine.
> 
> All of the Dapper ia32 machines have no trouble, even our older
> hardware.
> 
>

Hi Kenny

Interesting... You forgot to attach the mcasttest.cc program.
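
In the meantime, here is a minimal sketch of what such a
receive-and-count client might look like, just so we are sure we're
talking about the same kind of test. The group, port, idle timeout and
buffer size below are my assumptions, not values taken from your
program:

/* mcast_client_sketch.cc: join a group, count datagrams, print the
 * total. Group/port (239.255.0.1:12345) and the 5s idle timeout are
 * guesses, not taken from the missing mcasttest.cc. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main()
{
	const char *group = "239.255.0.1";	/* assumed multicast group */
	const int port = 12345;			/* assumed port */

	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0) { perror("socket"); return 1; }

	/* let several clients bind the same group/port on one machine */
	int one = 1;
	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

	sockaddr_in addr;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(port);
	if (bind(fd, (sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("bind"); return 1;
	}

	ip_mreq mreq;				/* join the group */
	mreq.imr_multiaddr.s_addr = inet_addr(group);
	mreq.imr_interface.s_addr = htonl(INADDR_ANY);
	setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

	/* stop after 5s of silence so lost packets cannot hang us */
	timeval tv = { 5, 0 };
	setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

	long count = 0;
	char buf[128];
	while (recv(fd, buf, sizeof(buf), 0) > 0)
		count++;			/* count every datagram seen */

	printf("received %ld messages\n", count);
	close(fd);
	return 0;
}

Note the SO_REUSEADDR before bind(): without it, only one of the 4
clients could bind the port, so presumably your program does something
similar.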

Any chance you could try a recent kernel (2.6.29-rcX)?

Could you post "cat /proc/interrupts" results (one for a working
setup, another for a non-working/dropping setup)?
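
For completeness, the sending side of such a test might look like the
sketch below. Again, this is only a sketch with the same assumed
group/port; your 50,000 msgs/sec rate works out to one datagram every
20 us, paced here as bursts of 500 every 10 ms, which is one of several
reasonable ways to do it:

/* mcast_server_sketch.cc: send 500,000 small datagrams at ~50,000/s.
 * Group/port and payload size are assumptions, as above. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main()
{
	const char *group = "239.255.0.1";	/* same assumed group/port */
	const int port = 12345;

	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0) { perror("socket"); return 1; }

	sockaddr_in dst;
	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_addr.s_addr = inet_addr(group);
	dst.sin_port = htons(port);

	char payload[32] = "mcasttest";		/* a "small message" */
	for (int i = 0; i < 500000; i++) {
		sendto(fd, payload, sizeof(payload), 0,
		       (sockaddr *)&dst, sizeof(dst));
		if (i % 500 == 499)
			usleep(10000);	/* 500 per 10 ms ~= 50,000/s */
	}
	close(fd);
	return 0;
}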


