Message-ID: <87vd5z5mh1.fsf@small.ssi.corp>
Date: Tue, 21 Sep 2010 14:07:22 +0200
From: arno@...isbad.org (Arnaud Ebalard)
To: Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Bruce Allan <bruce.w.allan@...el.com>,
Alex Duyck <alexander.h.duyck@...el.com>,
PJ Waskiewicz <peter.p.waskiewicz.jr@...el.com>,
John Ronciak <john.ronciak@...el.com>
Cc: netdev@...r.kernel.org, Brian Haley <brian.haley@...com>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
Stefan Rompf <sux@...lof.de>,
David Miller <davem@...emloft.net>
Subject: [BUG,E1000E] first packets after device is reported up are silently dropped
Hi,
When the driver (E1000E) reports the link up again (after plugging in a
cable), the first packets sent immediately after that event are
sometimes *silently* dropped by the hardware.
Before describing the tests, here is some information on the hardware
and software. Don't hesitate to ask if you need more:
Kernel: 2.6.35.4
Hardware: Intel 82567LM (rev3) Gigabit adapter on a Dell E4300
Here is what ethtool reports for the driver:
driver: e1000e
version: 1.0.2-k4
firmware-version: 1.7-7
bus-info: 0000:00:19.0
Switch: tested with a Cisco Catalyst 2960 (100 Mbit/s), a Planex
FX08-Mini (100 Mbit/s) and a Planex 5-port Gigabit switch
The setup is pretty simple: two different userland tools (umip and
netplug) monitor netlink NEWLINK events and send, respectively, an
ICMPv6 Router Solicitation packet and an IPv4 DHCP request as soon as
they learn that the interface is UP and RUNNING:
00:21:70:bd:ef:fc > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 62: :: > ff02::2: ICMP6, router solicitation, length 8
00:21:70:bd:ef:fc > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from
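For anyone who wants to reproduce the trigger without umip or netplug, here is a minimal sketch of the monitoring step described above. It is not the code of either tool: it only assumes the standard rtnetlink layout (a 16-byte nlmsghdr followed by a 16-byte ifinfomsg) and the usual Linux values for RTM_NEWLINK, RTMGRP_LINK, IFF_UP and IFF_RUNNING. The function names are made up for this example.

```python
import socket
import struct

# Constants from <linux/rtnetlink.h> and <linux/if.h>.
RTM_NEWLINK = 16
RTMGRP_LINK = 1
IFF_UP = 0x1
IFF_RUNNING = 0x40

# struct nlmsghdr: len, type, flags, seq, pid (native byte order).
NLMSGHDR = struct.Struct("=IHHII")
# struct ifinfomsg: family, pad, type, index, flags, change.
IFINFOMSG = struct.Struct("=BBHiII")


def parse_link_events(data):
    """Yield (ifindex, flags) for each RTM_NEWLINK message in a datagram."""
    offset = 0
    while offset + NLMSGHDR.size <= len(data):
        length, msg_type, _flags, _seq, _pid = NLMSGHDR.unpack_from(data, offset)
        if length < NLMSGHDR.size or offset + length > len(data):
            break  # truncated or malformed message, stop parsing
        if msg_type == RTM_NEWLINK and length >= NLMSGHDR.size + IFINFOMSG.size:
            fields = IFINFOMSG.unpack_from(data, offset + NLMSGHDR.size)
            yield fields[3], fields[4]  # ifi_index, ifi_flags
        offset += (length + 3) & ~3  # netlink messages are 4-byte aligned


def is_up_and_running(flags):
    """True when both IFF_UP and IFF_RUNNING are set."""
    wanted = IFF_UP | IFF_RUNNING
    return (flags & wanted) == wanted


def monitor():
    """Print a line each time an interface is reported UP and RUNNING."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))  # subscribe to link notifications
    while True:
        for ifindex, flags in parse_link_events(sock.recv(65535)):
            if is_up_and_running(flags):
                print(f"ifindex {ifindex}: UP and RUNNING")
```

Calling monitor() on a Linux box and replugging the cable should print one line per NEWLINK event that carries both flags; that is the point at which the tools emit their first packet.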
The idea is to have address autoconfiguration performed as soon as
possible. Because the tools retransmit only after a few seconds, a
dropped first packet results in a long delay.
I noticed that sometimes the first few packets emitted by the tools are
not answered. I ran tcpdump on the other side: nothing arrives. I even
checked the LED on the switch: it does not blink.
I first tried adding a few ms of delay between the reception of the
NEWLINK and the emission of the packets. It seems the higher the better,
but at *550ms* I still managed to have the initial packet dropped from
time to time.
I then spent time in the kernel (net/core/dev.c, net/sched/sch_generic.c,
drivers/net/e1000e/netdev.c) following the first packet to see where it
gets dropped. I ended up in e1000_xmit_frame(), in which everything seems
to be ok. AFAICT, the packet is delivered to the hardware and then
silently killed for some unknown reason.
I added various debug statements in the code (custom printk(),
calls to e1000e_dump()) to try and understand what differs in the
driver's state between the cases where the first packet is delivered
and those where it is not. I found nothing interesting.
I am currently out of ideas. Is this a known bug? What can be happening?
If you have patches you want me to test to get additional info, don't
hesitate!
Cheers,
a+
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html