lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 19 Feb 2007 17:18:45 -0500
From:	lsorense@...lub.uwaterloo.ca (Lennart Sorensen)
To:	pcnet32@...izon.net
Cc:	netdev@...r.kernel.org
Subject: Re: Re: Strange connection slowdown on pcnet32

On Mon, Feb 19, 2007 at 03:11:36PM -0500, Lennart Sorensen wrote:
> I have been poking at things with firescope to see if the MAC is
> actually writing to system memory or not.
> 
> The entry that it gets stuch on is _always_ entry 0 in the rx_ring.
> There does not appear to be any exceptions to this.  
> 
> Here is my firescope (slightly modified for this purpose) dump of the
> rx_ring of eth1:
> 
> Descriptor:Address: /--base---\ /buf\ /sta\ /-message-\ /reserved-\
>           :       : |         | |len| |tus| | length  | |         |
> RXdesc[00]:6694000: 12 18 5f 05 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[01]:6694010: 12 78 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[02]:6694020: 12 a0 52 06 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[03]:6694030: 12 f8 c2 04 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[04]:6694040: 12 70 15 05 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[05]:6694050: 12 e8 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[06]:6694060: 12 e0 37 05 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[07]:6694070: 12 e8 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[08]:6694080: 12 e0 d5 04 fa f9 40 03 ee 05 00 00 00 00 00 00
> RXdesc[09]:6694090: 12 d8 d1 05 fa f9 40 03 46 00 00 00 00 00 00 00
> RXdesc[10]:66940a0: 12 d0 d1 05 fa f9 40 03 4e 00 00 00 00 00 00 00
> RXdesc[11]:66940b0: 12 d8 02 05 fa f9 10 03 40 00 00 00 00 00 00 00
> RXdesc[12]:66940c0: 12 d0 02 05 fa f9 40 03 46 00 00 00 00 00 00 00
> RXdesc[13]:66940d0: 12 38 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[14]:66940e0: 12 30 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[15]:66940f0: 12 78 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[16]:6694100: 12 a0 58 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[17]:6694110: 12 b0 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[18]:6694120: 12 b8 04 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[19]:6694130: 12 70 2c 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[20]:6694140: 12 f8 56 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[21]:6694150: 12 c8 29 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[22]:6694160: 12 20 03 05 fa f9 00 80 ee 05 00 00 00 00 00 00
> RXdesc[23]:6694170: 12 60 4c 05 fa f9 00 80 87 05 00 00 00 00 00 00
> RXdesc[24]:6694180: 12 98 53 05 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[25]:6694190: 12 b0 cc 04 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[26]:66941a0: 12 a8 3f 05 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[27]:66941b0: 12 58 e8 04 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[28]:66941c0: 12 b0 4d 06 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[29]:66941d0: 12 38 ef 04 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[30]:66941e0: 12 98 1f 05 fa f9 00 80 40 00 00 00 00 00 00 00
> RXdesc[31]:66941f0: 12 28 f1 04 fa f9 00 80 40 00 00 00 00 00 00 00
> 
> I only ever see entry 0 as status 0080 (0x8000 which is owned by mac),
> and this is while the driver is checking entry 0 every time it tries to
> check for any waiting packets.
> 
> Running tcpdump while pinging gives the interesting result that some
> packets are ariving out of order making it seem like the driver is
> processing the packets out of order.  Perhaps the driver is wrong to be
> looking at entry 0, and should be looking at entry 1 and is hence stuck
> until the whole receive ring has been filled again?
> 
> 15:06:04.112812 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 1
> 15:06:05.119799 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 2
> 15:06:05.120159 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 2
> 15:06:05.127045 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 1
> 15:06:06.119862 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 3
> 15:06:07.119921 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 4
> 15:06:08.119994 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 5
> 15:06:08.426400 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 3
> 15:06:08.427915 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 4
> 15:06:08.429033 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 5
> 15:06:09.120053 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 6
> 15:06:10.120109 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 7
> 15:06:10.705332 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 6
> 15:06:10.707258 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 7
> 15:06:11.120175 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 8
> 15:06:12.120233 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 9
> 15:06:13.120297 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 10
> 15:06:14.120359 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 11
> 15:06:14.120737 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 11
> 15:06:14.127064 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 8
> 15:06:14.127700 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 9
> 15:06:14.128268 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 10
> 15:06:15.120426 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 12
> 15:06:16.120484 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 13
> 15:06:16.120840 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 13
> 15:06:16.129476 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 12
> 15:06:17.120551 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 14
> 15:06:17.615429 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 14
> 15:06:18.120612 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 15
> 15:06:19.120672 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 16
> 15:06:19.748981 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 15
> 15:06:19.749976 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 16
> 15:06:20.120738 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 17
> 15:06:21.120799 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 18
> 15:06:22.120859 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 19
> 15:06:23.120926 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 20
> 15:06:24.120985 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 21
> 15:06:24.881809 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 17
> 15:06:24.882287 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 18
> 15:06:24.882578 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 19
> 15:06:24.883347 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 20
> 15:06:24.884158 IP 10.128.10.1 > 10.128.10.254: icmp 64: echo reply seq 21
> 15:06:25.129659 IP 10.128.10.254 > 10.128.10.1: icmp 64: echo request seq 22
> 
> Does anyone see any reason why the driver would get into such a state
> after a pile of traffic in a row?  So far I have only seen it when NAPI
> is enabled, so I have to assume so far that it makes a difference to it.

With more beating of the driver, I have managed to see it fail in places
other than descriptor 0.  It is as if the MAC doesn't transfer the
descriptor to the system memory, although I am not yet sure if there is
packet loss involved involving retransmision or not.

I tried making this change as a test:

  static int pcnet32_rx(struct net_device *dev, int quota)
  {
          struct pcnet32_private *lp = dev->priv;
          int entry = lp->cur_rx & lp->rx_mod_mask;
+         int nextentry = (lp->cur_rx+1) & lp->rx_mod_mask;
          struct pcnet32_rx_head *rxp = &lp->rx_ring[entry];
+         struct pcnet32_rx_head *nextrxp = &lp->rx_ring[nextentry];
          int npackets = 0;

+         /* Check for confused MAC and resync if necesary */
+         if ((short)le16_to_cpu(rxp->status) < 0 && (short)le16_to_cpu(nextrxp->status) >= 0) {
+                 entry = (++lp->cur_rx) & lp->rx_mod_mask;
+                 rxp = &lp->rx_ring[entry];
+                 printk("%s: rx_ring bugfix on entry %d\n",dev->name,entry);
+         }

          /* If we own the next entry, it's a new packet. Send it up. */
          while (quota > npackets && (short)le16_to_cpu(rxp->status) >= 0) {
                  pcnet32_rx_entry(dev, lp, rxp, entry);
                  npackets += 1;
                  /*
                   * The docs say that the buffer length isn't touched, but Andrew
                   * Boyd of QNX reports that some revs of the 79C965 clear it.
                   */
                  rxp->buf_length = le16_to_cpu(2 - PKT_BUF_SZ);
                  wmb();  /* Make sure owner changes after others are visible */
                  rxp->status = le16_to_cpu(0x8000);
                  entry = (++lp->cur_rx) & lp->rx_mod_mask;
                  rxp = &lp->rx_ring[entry];
          }

          return npackets;
  }

Now this really can't be a solution.  It still occationally gets "stuck"
for a short time, but with this change it manages to recover again but
it does seem to be hitting by printk once in a while.  I am starting to
wonder of the PCI transfers are failing once in a while.  I still can't
figure out what NAPI is doing different unless some of the pcnet32_rx
code should be run with interrupts disabled and isn't in NAPI mode.

eth1: rx_ring bugfix on entry 28
eth1: rx_ring bugfix on entry 15
eth1: rx_ring bugfix on entry 31
eth1: rx_ring bugfix on entry 2
eth1: rx_ring bugfix on entry 23
eth1: rx_ring bugfix on entry 25
eth1: rx_ring bugfix on entry 2
eth1: rx_ring bugfix on entry 21
eth1: rx_ring bugfix on entry 24
eth1: rx_ring bugfix on entry 27
eth1: rx_ring bugfix on entry 28
eth1: rx_ring bugfix on entry 23
eth1: rx_ring bugfix on entry 11
eth1: rx_ring bugfix on entry 28
eth1: rx_ring bugfix on entry 27
eth1: rx_ring bugfix on entry 24
eth1: rx_ring bugfix on entry 28
eth1: rx_ring bugfix on entry 17

--
Len Sorensen
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ