Message-ID: <91bdcedb0903172050td2ef895he48168987ad94472@mail.gmail.com>
Date: Tue, 17 Mar 2009 22:50:21 -0500
From: Dave Boutcher <daveboutcher@...il.com>
To: Eric Dumazet <dada1@...mosbay.com>
Cc: netdev@...r.kernel.org
Subject: Re: IGMP Join dropping multicast packets
On Mon, Mar 16, 2009 at 2:01 PM, Eric Dumazet <dada1@...mosbay.com> wrote:
> Dave Boutcher wrote:
>> On Sat, Mar 14, 2009 at 9:37 PM, Eric Dumazet <dada1@...mosbay.com> wrote:
>>> Dave Boutcher wrote:
>>>> I'm running into an interesting problem with joining multiple
>>>> multicast feeds. If you join multiple multicast feeds using
>>>> setsockopt(...,IP_ADD_MEMBERSHIP...) it causes packets on UNRELATED
>>>> multicast feeds to get dropped. We have a multicast feed on a rock
>>>> solid network, and we were very surprised to see dropped packets. The
>>>> cause was a different process/program being run by a different user
>>>> joining a bunch of multicast feeds.
>>> I could not reproduce the problem on my machines (bnx2 adapter), even if changing
>>> NUMSOCK from 55 to 200 in joiner.c
>>
>> Thanks for trying Eric. Based on your email I did some more testing
>> and thus far I've only recreated this on x86_64 arches, not on i386.
>> Which arch did you try it on?
>
> I tried both, 32 and 64 bit kernels. No problems so far.
>
> Could you post a linux kernel .config of a non 'working' machine, and dmesg output ?
Eric, since you could not recreate this, I tried some other hardware
I had lying around that has an AMD chipset with a built-in NIC.
I could not recreate the problem on that hardware. I'm starting to
think this is an e1000 problem. Both the e1000 and e1000e drivers
use the following logic when updating the multicast list:
    /* clear the old settings from the multicast hash table */
    for (i = 0; i < mta_reg_count; i++) {
        E1000_WRITE_REG_ARRAY(hw, MTA, i, 0);
        E1000_WRITE_FLUSH();
    }

    /* load any remaining addresses into the hash table */
    for (; mc_ptr; mc_ptr = mc_ptr->next) {
        hash_value = e1000_hash_mc_addr(hw, mc_ptr->da_addr);
        e1000_mta_set(hw, hash_value);
    }
There's clearly a window, between the clearing loop and the end of the
reload loop, where the NIC doesn't have the multicast addresses loaded.
This may just be broken-as-designed. If anyone else has some e1000
hardware and wants to try recreating this, I'd be curious to hear the
results.
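
To make the suspected window concrete, here is a small user-space model
of the clear-then-reload sequence. It is only a sketch under stated
assumptions: the table size (128 32-bit registers), the helper names
(toy_hash, mta_set, mta_accepts, rebuild_and_count_misses) and the hash
itself are illustrative stand-ins, not the driver's actual code. It
counts how many intermediate steps would filter out an address that was
already joined before the update began.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MTA_REGS 128            /* assumed size: 128 x 32-bit registers */

struct mac { uint8_t b[6]; };

static uint32_t mta[MTA_REGS];  /* stands in for the NIC's hash table */

/* Toy 12-bit hash over the last two MAC bytes; NOT the driver's hash. */
static unsigned toy_hash(const struct mac *m)
{
    return ((m->b[4] >> 4) | ((unsigned)m->b[5] << 4)) & 0xFFF;
}

/* Set the table bit selected by a hash value. */
static void mta_set(unsigned hash)
{
    mta[(hash >> 5) & 0x7F] |= 1u << (hash & 0x1F);
}

/* Would the modelled filter currently pass frames for this address? */
static int mta_accepts(const struct mac *m)
{
    unsigned hash = toy_hash(m);
    return (int)((mta[(hash >> 5) & 0x7F] >> (hash & 0x1F)) & 1u);
}

/*
 * Rebuild the table the way the quoted loops do: zero every register,
 * then set one bit per address. After each write, check whether `probe`
 * would still be accepted, and return how many steps would drop it.
 */
static int rebuild_and_count_misses(const struct mac *addrs, int n,
                                    const struct mac *probe)
{
    int misses = 0;

    for (int i = 0; i < MTA_REGS; i++) {
        mta[i] = 0;
        if (!mta_accepts(probe))
            misses++;
    }
    for (int i = 0; i < n; i++) {
        mta_set(toy_hash(&addrs[i]));
        if (!mta_accepts(probe))
            misses++;
    }
    return misses;
}
```

If the probe address is set in the table beforehand and then included in
the new list, rebuild_and_count_misses() still reports a nonzero number
of steps during which it is filtered, even though it is accepted both
before and after the update. That matches the symptom of drops on a
feed unrelated to the one being joined.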
Some other notes just FYI...
- RcvbufErrors in /proc/net/snmp doesn't get incremented when this happens
- there are no messages in dmesg
- frames get dropped when the program calls exit() and all the sockets
  get closed (and multicast joins dropped) as well as when the
  ADD_MEMBERSHIPs happen
- The problem happens even when adding a sleep(1) in between each of the
  ADD_MEMBERSHIP calls.
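
For anyone wanting to check the RcvbufErrors observation on their own
machine, the counter can be read programmatically before and after a
test run. The following is a minimal sketch, assuming the usual
/proc/net/snmp layout of a header line of field names followed by a
line of values, both prefixed with the protocol name. The function
names snmp_counter and udp_rcvbuf_errors are hypothetical helpers, not
part of any existing tool.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up one counter in /proc/net/snmp-style text, where a header line
 * ("Udp: InDatagrams ... RcvbufErrors ...") is followed by a value line
 * ("Udp: 10 ... 0 ..."). Returns 0 on success, -1 if not found.
 */
static int snmp_counter(const char *text, const char *proto,
                        const char *field, long *out)
{
    size_t plen = strlen(proto);
    size_t flen = strlen(field);
    const char *hdr = text;

    while (hdr && *hdr) {
        const char *eol = strchr(hdr, '\n');

        if (eol && strncmp(hdr, proto, plen) == 0 && hdr[plen] == ':') {
            const char *val = eol + 1;      /* value line follows header */
            if (strncmp(val, proto, plen) != 0 || val[plen] != ':')
                return -1;

            const char *h = hdr + plen + 1;
            const char *v = val + plen + 1;

            /* Walk both lines one token at a time, in lockstep. */
            for (;;) {
                while (*h == ' ') h++;
                while (*v == ' ') v++;
                if (*h == '\n' || *h == '\0' || *v == '\n' || *v == '\0')
                    return -1;              /* field not present */

                size_t hl = strcspn(h, " \n");
                if (hl == flen && strncmp(h, field, hl) == 0) {
                    *out = strtol(v, NULL, 10);
                    return 0;
                }
                h += hl;
                v += strcspn(v, " \n");
            }
        }
        hdr = eol ? eol + 1 : NULL;
    }
    return -1;
}

/* Read the live counter; the 8 KiB buffer size is an assumption. */
static int udp_rcvbuf_errors(long *out)
{
    char buf[8192];
    FILE *f = fopen("/proc/net/snmp", "r");
    if (!f)
        return -1;
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return snmp_counter(buf, "Udp", "RcvbufErrors", out);
}
```

Calling udp_rcvbuf_errors() before and after running the joiner and
comparing the two values shows whether drops are being accounted at the
socket layer. An unchanged value, as reported above, is consistent with
frames being discarded before they reach the UDP stack.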
--
Dave B