Date:	Wed, 29 Aug 2012 20:13:18 +0000
From:	"Dave, Tushar N" <tushar.n.dave@...el.com>
To:	Kelvie Wong <kelvie@...e.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"e1000-devel@...ts.sourceforge.net" 
	<e1000-devel@...ts.sourceforge.net>
CC:	Kelvie Wong <kwong@...ldtech.com>
Subject: RE: Intel 82574L hang when sending short ethernet packets at
 100BaseT

>-----Original Message-----
>From: netdev-owner@...r.kernel.org [mailto:netdev-owner@...r.kernel.org]
>On Behalf Of Kelvie Wong
>Sent: Wednesday, August 29, 2012 1:07 PM
>To: netdev@...r.kernel.org; e1000-devel@...ts.sourceforge.net
>Cc: Kelvie Wong
>Subject: Intel 82574L hang when sending short ethernet packets at 100BaseT
>
>Hello all,
>
>We (at Wurldtech) have found a problem with the Intel 82574L controller or
>possibly the e1000e driver whilst trying to send out invalid ethernet
>frames. We do understand that the ethernet frames should be padded anyway,
>and it is our experience that other network cards just pad the ethernet
>frame with nulls rather than hang.
>
>We have reproduced this with the e1000e driver on kernel versions 3.6-rc3
>and 3.0-rc4, as well as on kernel 2.6.22-14 with the e1000e driver from
>SourceForge, versions 1.9.5 and 2.0.0.1.
>
>Anyway, here is the reproduction tool:

I see the tool's help text says "Adding enough zero pad bytes to bring frame size up to 17 bytes causes the reset to no longer occur". Is this true?

-Tushar

>
>char HELP[] =
>"\n"
>"Intel 82574L Ethernet controllers reset when short eth frames are written\n"
>"to them, and the Linux e1000e driver detects and restarts them, during\n"
>"which time link goes down and packet loss occurs.\n"
>"\n"
>"Adding enough zero pad bytes to bring frame size up to 17 bytes causes\n"
>"the reset to no longer occur. The frame addresses and ethtype seem\n"
>"unrelated to the reset.\n"
>"\n"
>"This is a reproduction utility. Similar effects can be seen on Windows, so\n"
>"the problem is not specific to the Linux driver.\n"
>"\n"
>"When card resets, kernel messages are seen, like:\n"
>"\n"
>"e1000e 0000:02:00.0: eth1: Detected Hardware Unit Hang:\n"
>"  ...\n"
>"e1000e 0000:02:00.0: eth1: Reset adapter\n"
>;
>
>#include <errno.h>
>#include <stdio.h>
>#include <stdlib.h>
>#include <string.h>
>#include <unistd.h>
>#include <arpa/inet.h>
>#include <net/if.h>
>#include <netinet/ether.h>
>#include <netpacket/packet.h>
>#include <sys/ioctl.h>
>#include <sys/socket.h>
>
>static int check(const char* call, int ret) {
>    if(ret < 0) {
>        fprintf(stderr, "%s failed with [%d] %m\n", call, errno);
>        exit(1);
>    }
>    return ret;
>}
>
>#define Check(X) check(#X, X)
>
>int main(int argc, char* argv[])
>{
>    struct ether_header* eth;
>    char* optdev = "eth0";
>    int optcount = 4;
>    int optlen = 14;
>    /* valid eth macs, pulled from random cards */
>    const char* optsrc = "00:a1:b0:00:00:f9";
>    const char* optdst = "20:cf:30:b4:50:87";
>    int opttype = 0;
>    int opt;
>
>    while(-1 != (opt = getopt(argc, argv, "i:c:l:s:d:t:"))) {
>        switch(opt) {
>            case 'i':
>                optdev = optarg;
>                break;
>            case 'c':
>                optcount = atoi(optarg);
>                break;
>            case 'l':
>                optlen = atoi(optarg);
>                break;
>            case 's':
>                optsrc = optarg;
>                break;
>            case 'd':
>                optdst = optarg;
>                break;
>            case 't':
>                opttype = strtol(optarg, NULL, 0);
>                break;
>            default:
>                fprintf(stderr, "usage: %s -i ifx -c count -l len -s ethsrc "
>                        "-d ethdst -t ethtype\n%s", argv[0], HELP);
>                return 1;
>        }
>    }
>
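>    /* Allocate and zero at least a full header so the writes below stay in bounds. */
>    size_t buflen = optlen > (int)sizeof(*eth) ? (size_t)optlen : sizeof(*eth);
>    eth = alloca(buflen);
>
>    memset(eth, 0, buflen);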
>
>    memcpy(eth->ether_dhost, ether_aton(optdst), 6);
>    memcpy(eth->ether_shost, ether_aton(optsrc), 6);
>    eth->ether_type = htons(opttype);
>
>    int fd = Check(socket(AF_PACKET,SOCK_RAW,0));
>
>    struct ifreq ifreq;
>
>    memset(&ifreq, 0, sizeof(ifreq));
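>    /* ifreq was zeroed above, so copying at most IFNAMSIZ - 1 bytes keeps it terminated. */
>    strncpy(ifreq.ifr_name, optdev, IFNAMSIZ - 1);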
>
>    Check(ioctl(fd,SIOCGIFINDEX,&ifreq));
>
>    struct sockaddr_ll ifaddr = {
>        .sll_ifindex = ifreq.ifr_ifindex,
>        .sll_family = AF_PACKET
>    };
>
>    Check(bind(fd, (struct sockaddr *)&ifaddr, sizeof(ifaddr)));
>
>    printf("dev %s count %d len %d src %s dst %s ethtype %d\n",
>            optdev, optcount, optlen, optsrc, optdst, opttype);
>
>    for(; optcount > 0; optcount--) {
>        Check(send(fd, eth, optlen, 0));
>        /* TODO add a nanosleep() here? */
>    }
>    return 0;
>}
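
As a point of comparison with the unpadded send() above, here is a minimal sketch of padding a short frame in userspace before it is written to the socket. The helper name pad_frame and the constant MIN_FRAME_LEN are illustrative and not part of the reproduction tool; 60 is the usual minimum Ethernet frame length without the FCS (ETH_ZLEN in <linux/if_ether.h>), while the report says 17 bytes was already enough to avoid the reset.

```c
#include <stddef.h>
#include <string.h>

/* Minimum Ethernet frame length without the 4-byte FCS (same value as
 * ETH_ZLEN in <linux/if_ether.h>); redefined so the sketch is self-contained. */
#define MIN_FRAME_LEN 60

/*
 * Hypothetical helper: copy `len` bytes of `frame` into `out` and zero-fill
 * up to `min_len`. Returns the number of bytes to hand to send(), or 0 if
 * `out` (capacity `cap`) is too small to hold the padded frame.
 */
static size_t pad_frame(const unsigned char *frame, size_t len,
                        unsigned char *out, size_t cap, size_t min_len)
{
    size_t total = len < min_len ? min_len : len;

    if (total > cap)
        return 0;
    memcpy(out, frame, len);
    memset(out + len, 0, total - len);
    return total;
}
```

A caller would pass the returned length to send() instead of optlen. Passing 17 as min_len would match the threshold described in the help text.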
>
>The default settings should be sufficient, given:
>
>1. eth0 is an 82574L
>2. It is connected at 100Mbit
>
>This only works (for whatever reason) when the link rate is set to
>100BaseT; in my most recent tests, I used ethtool to set the card to not
>advertise the Gigabit rate, if that matters.
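
Since the hang is reported only at 100 Mbit, it may help to confirm the negotiated speed before each run. The following is a minimal sketch, assuming the kernel exposes the speed attribute at /sys/class/net/<ifname>/speed; the helper name link_speed_mbps is made up for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read the negotiated link speed in Mb/s from
 * /sys/class/net/<ifname>/speed. Returns -1 if the attribute cannot be
 * opened or parsed (for example, no such interface).
 */
static int link_speed_mbps(const char *ifname)
{
    char path[128];
    int speed = -1;
    FILE *f;

    /* Reject names that would not fit in the path buffer. */
    if (snprintf(path, sizeof(path), "/sys/class/net/%s/speed", ifname)
            >= (int)sizeof(path))
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%d", &speed) != 1)
        speed = -1;
    fclose(f);
    return speed;
}
```

A test harness could refuse to send unless link_speed_mbps("eth0") returns 100.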
>
>The relevant dmesg is:
>
>Aug 29 06:53:01 localmachine kernel: ------------[ cut here ]------------
>Aug 29 06:53:01 localmachine kernel: WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x1e9/0x200()
>Aug 29 06:53:01 localmachine kernel: Hardware name: MX945GSE
>Aug 29 06:53:01 localmachine kernel: NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
>Aug 29 06:53:01 localmachine kernel: ACPI: Invalid Power Resource to register!
>Aug 29 06:53:01 localmachine kernel: Modules linked in: ipt_REJECT xt_state iptable_filter xt_dscp xt_string xt_multiport xt_hashlimit xt_mark xt_connmark ip_tables xt_conntrack nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 coretemp serio_raw rng_core
>Aug 29 06:53:01 localmachine kernel: Pid: 3979, comm: python Not tainted 3.6.0-rc3 #1
>Aug 29 06:53:01 localmachine kernel: Call Trace:
>Aug 29 06:53:01 localmachine kernel:  [<c12f3709>] ? dev_watchdog+0x1e9/0x200
>Aug 29 06:53:01 localmachine kernel:  [<c102fccc>] warn_slowpath_common+0x7c/0xa0
>Aug 29 06:53:01 localmachine kernel:  [<c12f3709>] ? dev_watchdog+0x1e9/0x200
>Aug 29 06:53:01 localmachine kernel:  [<c102fd6e>] warn_slowpath_fmt+0x2e/0x30
>Aug 29 06:53:01 localmachine kernel:  [<c12f3709>] dev_watchdog+0x1e9/0x200
>Aug 29 06:53:01 localmachine kernel:  [<c103b185>] run_timer_softirq+0x105/0x1c0
>Aug 29 06:53:01 localmachine kernel:  [<c107c6ae>] ? rcu_process_callbacks+0x27e/0x420
>Aug 29 06:53:01 localmachine kernel:  [<c12f3520>] ? dev_trans_start+0x50/0x50
>Aug 29 06:53:01 localmachine kernel:  [<c1036bb7>] __do_softirq+0x87/0x130
>Aug 29 06:53:01 localmachine kernel:  [<c1036b30>] ? _local_bh_enable+0x10/0x10
>Aug 29 06:53:01 localmachine kernel:  <IRQ>  [<c1036ef2>] ? irq_exit+0x32/0x70
>Aug 29 06:53:01 localmachine kernel:  [<c1021f79>] ? smp_apic_timer_interrupt+0x59/0x90
>Aug 29 06:53:01 localmachine kernel:  [<c137102a>] ? apic_timer_interrupt+0x2a/0x30
>Aug 29 06:53:01 localmachine kernel: ---[ end trace a6b1fdf47766ee1f ]---
>Aug 29 06:53:01 localmachine kernel: e1000e 0000:02:00.0: eth1: Reset adapter
>Aug 29 06:53:03 localmachine kernel: e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx
>Aug 29 06:53:03 localmachine kernel: e1000e 0000:02:00.0: eth1: 10/100 speed: disabling TSO
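
For anyone scripting repeated runs, the messages quoted in this report can be matched mechanically. This is only a sketch: the helper name is_reset_event is invented here, and the marker strings are simply the three phrases that appear in the help text and the log above, so other kernel versions may word them differently.

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a kernel log line contains one of the
 * hang/reset phrases quoted in this report, 0 otherwise.
 */
static int is_reset_event(const char *line)
{
    static const char *const markers[] = {
        "Detected Hardware Unit Hang",
        "NETDEV WATCHDOG",
        "Reset adapter",
    };
    size_t i;

    for (i = 0; i < sizeof(markers) / sizeof(markers[0]); i++) {
        if (strstr(line, markers[i]) != NULL)
            return 1;
    }
    return 0;
}
```

Feeding each new log line through this check after a run gives a simple pass/fail signal per frame length.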
>
>Relevant lspci -vv (8086:10d3 is the Intel 82574L Gigabit Ethernet
>Controller; pciutils is out of date on this machine)
>
>01:00.0 Class 0200: Device 8086:10d3
>    Subsystem: Device 8086:0000
>    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>    Latency: 0, Cache Line Size: 64 bytes
>    Interrupt: pin A routed to IRQ 16
>    Region 0: Memory at fe9e0000 (32-bit, non-prefetchable) [size=128K]
>    Region 2: I/O ports at dc80 [size=32]
>    Region 3: Memory at fe9dc000 (32-bit, non-prefetchable) [size=16K]
>    Capabilities: [c8] Power Management version 2
>        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>        Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>    Capabilities: [d0] MSI: Mask- 64bit+ Count=1/1 Enable-
>        Address: 0000000000000000  Data: 0000
>    Capabilities: [e0] Express (v1) Endpoint, MSI 00
>        DevCap:    MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>        DevCtl:    Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>            MaxPayload 128 bytes, MaxReadReq 512 bytes
>        DevSta:    CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
>        LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
>            ClockPM- Surprise- LLActRep- BwNot-
>        LnkCtl:    ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
>            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>        LnkSta:    Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>    Capabilities: [a0] MSI-X: Enable- Mask- TabSize=3
>        Vector table: BAR=3 offset=00000000
>        PBA: BAR=3 offset=00002000
>    Capabilities: [100] Advanced Error Reporting
>        UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>        UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>        UESvrt:    DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>        CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>        CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>        AERCap:    First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
>    Kernel driver in use: e1000e
>
>02:00.0 Class 0200: Device 8086:10d3
>    Subsystem: Device 8086:0000
>    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>    Latency: 0, Cache Line Size: 64 bytes
>    Interrupt: pin A routed to IRQ 17
>    Region 0: Memory at feae0000 (32-bit, non-prefetchable) [size=128K]
>    Region 2: I/O ports at ec80 [size=32]
>    Region 3: Memory at feadc000 (32-bit, non-prefetchable) [size=16K]
>    Capabilities: [c8] Power Management version 2
>        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>        Status: D0 PME-Enable- DSel=0 DScale=1 PME-
>    Capabilities: [d0] MSI: Mask- 64bit+ Count=1/1 Enable-
>        Address: 0000000000000000  Data: 0000
>    Capabilities: [e0] Express (v1) Endpoint, MSI 00
>        DevCap:    MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
>        DevCtl:    Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>            MaxPayload 128 bytes, MaxReadReq 512 bytes
>        DevSta:    CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
>        LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <128ns, L1 <64us
>            ClockPM- Surprise- LLActRep- BwNot-
>        LnkCtl:    ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
>            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>        LnkSta:    Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>    Capabilities: [a0] MSI-X: Enable- Mask- TabSize=3
>        Vector table: BAR=3 offset=00000000
>        PBA: BAR=3 offset=00002000
>    Capabilities: [100] Advanced Error Reporting
>        UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>        UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>        UESvrt:    DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>        CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>        CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>        AERCap:    First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
>    Kernel driver in use: e1000e
>
>And finally:
>
># ethtool -e eth1
>Offset          Values
>------          ------
>0x0000          00 04 5f b0 9a e5 ff ff ff ff 30 00 ff ff ff ff
>0x0010          ff ff ff ff 6b 02 00 00 86 80 d3 10 ff ff d8 80
>0x0020          00 00 01 20 74 7e ff ff 00 00 c8 00 00 00 00 27
>0x0030          c9 6c 50 21 0e 07 03 45 84 2d 40 00 00 f0 07 06
>0x0040          00 60 80 00 04 0f ff 7f 01 4d ec 92 5c fc 83 f0
>0x0050          20 00 83 00 a0 00 1f 7d 61 19 83 01 50 00 ff ff
>0x0060          00 01 00 40 1c 12 07 40 ff ff ff ff ff ff ff ff
>0x0070          ff ff ff ff ff ff ff ff ff ff ff ff ff ff cf 5e
>
>Thank you,
>--
>Kelvie Wong
>
>P.S. I sent this from my personal email because my work email (Cc'd)
>doesn't deal with mailing lists well.
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in the
>body of a message to majordomo@...r.kernel.org More majordomo info at
>http://vger.kernel.org/majordomo-info.html