Date:	Tue, 5 May 2009 20:41:35 +0300
From:	Vladimir Ivashchenko <hazard@...ncoudi.com>
To:	Eric Dumazet <dada1@...mosbay.com>
Cc:	netdev@...r.kernel.org
Subject: Re: bond + tc regression ?

> > On both kernels, the system is running with at least 70% idle CPU.
> > The network interrupts are distributed across the cores.
> 
> You should not distribute interrupts, but bind a NIC to one CPU

Kernels 2.6.28 and 2.6.29 do this by default, so I thought it was correct.
Are the defaults wrong?

I have tried with IRQs bound to one CPU per NIC. Same result.
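
For reference, binding an IRQ to a single CPU is done by writing a hex CPU bitmask to /proc/irq/<irq>/smp_affinity. Below is a minimal sketch that builds those masks. The IRQ numbers 57-60 are taken from the /proc/interrupts output in this message; the particular NIC-to-CPU mapping is only an illustrative assumption, not the exact mapping I tested, and the helper names are hypothetical.

```python
def affinity_mask(cpu):
    """Hex bitmask selecting exactly one CPU, in the form accepted by
    /proc/irq/<irq>/smp_affinity (valid as-is for CPU numbers below 32)."""
    if cpu < 0:
        raise ValueError("cpu must be non-negative")
    return format(1 << cpu, "x")


def affinity_commands(irq_to_cpu):
    """Build the shell commands that pin each IRQ to its CPU."""
    return [
        "echo {} > /proc/irq/{}/smp_affinity".format(affinity_mask(cpu), irq)
        for irq, cpu in sorted(irq_to_cpu.items())
    ]


if __name__ == "__main__":
    # Hypothetical mapping: eth0..eth3 (IRQ 57..60) onto CPU 0..3.
    for cmd in affinity_commands({57: 0, 58: 1, 59: 2, 60: 3}):
        print(cmd)
```

The commands have to be run as root, and a running irqbalance daemon may overwrite the masks again.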

> > I thought it was a e1000e driver issue, but tweaking e1000e ring buffers
> > didn't help. I tried using e1000 on 2.6.28 by adding necessary PCI IDs,
> > I tried running on a different server with bnx cards, I tried disabling
> > NO_HZ and HRTICK, but still I have the same problem.
> > 
> > However, if I don't utilize bond, but just apply rules on normal ethX
> > interfaces, there is no packet loss with 2.6.28/29. 
> > 
> > So, the problem appears only when I use 2.6.28/29 + bond + classful tc
> > combination. 
> > 
> > Any ideas ?
> > 
> 
> Yes, we need much more information :)
> Is it a forwarding setup only ?

Yes, the server is doing nothing else but forwarding, no iptables.

> cat /proc/interrupts

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:        130          0          0          0          0          0          0          0   IO-APIC-edge      timer
  1:          2          0          0          0          0          0          0          0   IO-APIC-edge      i8042
  3:          0          0          0          1          0          1          0          0   IO-APIC-edge
  4:          0          0          1          0          0          0          1          0   IO-APIC-edge
  9:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          4          0          0          0          0          0          0          0   IO-APIC-edge      i8042
 14:          0          0          0          0          0          0          0          0   IO-APIC-edge      ata_piix
 15:          0          0          0          0          0          0          0          0   IO-APIC-edge      ata_piix
 17:      30901      31910      31446      30655      31618      30550      31543      30958   IO-APIC-fasteoi   aacraid
 20:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 21:          0          0          0          0          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb5, ahci
 22:     298387     297642     295508     294368     295533     295430     295275     296036   IO-APIC-fasteoi   ehci_hcd:usb1, uhci_hcd:usb2
 23:      10868      10926      10980      10738      10939      10615      10761      10909   IO-APIC-fasteoi   uhci_hcd:usb3
 57: 1486251823 1486835830 1486677250 1487105983 1488000303 1485941815 1487728317 1486624997   PCI-MSI-edge      eth0
 58: 1510676329 1509708161 1510347202 1509969755 1508599471 1511220118 1509094578 1509727616   PCI-MSI-edge      eth1
 59: 1482578890 1483618556 1482963700 1483164528 1484561615 1482130645 1484116749 1483557717   PCI-MSI-edge      eth2
 60: 1507341647 1506685822 1506862759 1506612818 1505689367 1507559672 1505911622 1506940613   PCI-MSI-edge      eth3
NMI:          0          0          0          0          0          0          0          0   Non-maskable interrupts
LOC: 1020533656 1020535165 1020533613 1020534967 1020535173 1020534409 1020534985 1020534220   Local timer interrupts
RES:      18605      21215      15957      18637      22429      19493      16649      15589   Rescheduling interrupts
CAL:        160        214        186        185        199        205        190        180   Function call interrupts
TLB:     259515     264126     309016     312222     263163     265601     306189     305430   TLB shootdowns
TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
ERR:          0
MIS:          0
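
To make the distribution in the table above easier to check, here is a small sketch that parses /proc/interrupts text and reports what share of each IRQ's interrupts landed on the busiest CPU. With 8 CPUs, a value near 0.125 means the IRQ is spread evenly, and a value near 1.0 means it is pinned to one CPU. The function names are hypothetical; the sample rows are copied from the output above.

```python
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
 57: 1486251823 1486835830 1486677250 1487105983 1488000303 1485941815 1487728317 1486624997   PCI-MSI-edge      eth0
 58: 1510676329 1509708161 1510347202 1509969755 1508599471 1511220118 1509094578 1509727616   PCI-MSI-edge      eth1
ERR:          0
"""


def parse_interrupts(text):
    """Map each IRQ label to (per-CPU counts, description)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table


def busiest_share(counts):
    """Fraction of all interrupts handled by the busiest CPU."""
    total = sum(counts)
    return max(counts) / total if total else 0.0


if __name__ == "__main__":
    for irq, (counts, desc) in parse_interrupts(SAMPLE).items():
        print(irq, desc, round(busiest_share(counts), 4))
```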

> tc -s -d qdisc

For testing's sake, I just ran "tc qdisc add dev $IFACE root handle 1: prio" with no filters at all.
I get the same result with HTB ("tc qdisc add dev $IFACE root handle 1: htb default 99") and no subclasses.

qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 13287736273644 bytes 1263672018 pkt (dropped 0, overlimits 0 requeues 2928480094)
 rate 0bit 0pps backlog 0b 0p requeues 2928480094
qdisc pfifo_fast 0: dev eth1 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 40064376195000 bytes 1747026586 pkt (dropped 0, overlimits 0 requeues 463621814)
 rate 0bit 0pps backlog 0b 0p requeues 463621814
qdisc pfifo_fast 0: dev eth2 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 13350145517965 bytes 1350897201 pkt (dropped 0, overlimits 0 requeues 2930879507)
 rate 0bit 0pps backlog 0b 0p requeues 2930879507
qdisc pfifo_fast 0: dev eth3 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 40193456126884 bytes 1950653764 pkt (dropped 0, overlimits 0 requeues 465511120)
 rate 0bit 0pps backlog 0b 0p requeues 465511120
qdisc prio 1: dev bond0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 985164834 bytes 2720991 pkt (dropped 241834, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc prio 1: dev bond1 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 2347118738 bytes 3089171 pkt (dropped 304601, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

** Drops on bond0/bond1 are increasing by approximately 5000 per second:

qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 13287874353796 bytes 1264050808 pkt (dropped 0, overlimits 0 requeues 2928520779)
 rate 0bit 0pps backlog 0b 0p requeues 2928520779
qdisc pfifo_fast 0: dev eth1 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 40064706826018 bytes 1747459793 pkt (dropped 0, overlimits 0 requeues 463669610)
 rate 0bit 0pps backlog 0b 0p requeues 463669610
qdisc pfifo_fast 0: dev eth2 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 13350283202695 bytes 1351277761 pkt (dropped 0, overlimits 0 requeues 2930918488)
 rate 0bit 0pps backlog 0b 0p requeues 2930918488
qdisc pfifo_fast 0: dev eth3 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 40193784868074 bytes 1951084029 pkt (dropped 0, overlimits 0 requeues 465558015)
 rate 0bit 0pps backlog 0b 0p requeues 465558015
qdisc prio 1: dev bond0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 1260929539 bytes 3480340 pkt (dropped 311145, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc prio 1: dev bond1 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 3006490946 bytes 3952643 pkt (dropped 396850, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

With the same setup on 2.6.23, drops increase by only 50/sec or so.

As soon as I do "tc qdisc del dev $IFACE root", packet loss stops.
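
The two "tc -s -d qdisc" snapshots above can be compared mechanically. The following is a rough sketch that parses the output and subtracts the counters per device; it assumes one qdisc per device, as in the output above, and the function names are hypothetical. The time between the two snapshots is not recorded in this message, so the interval passed to drops_per_second() has to be supplied by whoever takes the measurements.

```python
import re

QDISC_RE = re.compile(r"^qdisc (\S+) (\S+) dev (\S+)")
SENT_RE = re.compile(
    r"Sent (\d+) bytes (\d+) pkt "
    r"\(dropped (\d+), overlimits (\d+) requeues (\d+)\)"
)
FIELDS = ("bytes", "packets", "dropped", "overlimits", "requeues")

BEFORE = """\
qdisc prio 1: dev bond0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 985164834 bytes 2720991 pkt (dropped 241834, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
"""
AFTER = """\
qdisc prio 1: dev bond0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 1260929539 bytes 3480340 pkt (dropped 311145, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
"""


def parse_tc_stats(text):
    """Map device name to its counters (assumes one qdisc per device)."""
    stats, dev = {}, None
    for line in text.splitlines():
        m = QDISC_RE.match(line.strip())
        if m:
            dev = m.group(3)
            continue
        m = SENT_RE.search(line)
        if m and dev is not None:
            stats[dev] = dict(zip(FIELDS, map(int, m.groups())))
            dev = None
    return stats


def stats_delta(before, after):
    """Counter differences for devices present in both snapshots."""
    return {
        dev: {k: after[dev][k] - before[dev][k] for k in FIELDS}
        for dev in before if dev in after
    }


def drops_per_second(delta, interval_s):
    """Drop rate per device for a caller-supplied interval in seconds."""
    return {dev: d["dropped"] / interval_s for dev, d in delta.items()}


if __name__ == "__main__":
    print(stats_delta(parse_tc_stats(BEFORE), parse_tc_stats(AFTER)))
```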

> cat /proc/net/bonding/bond0

Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 80
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 2
        Actor Key: 17
        Partner Key: 4
        Partner Mac Address: 00:19:e7:b2:07:80

Slave Interface: eth0
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:24:bd:e9:cc
Aggregator ID: 1

Slave Interface: eth2
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:24:bd:e9:ce
Aggregator ID: 1

> cat /proc/net/bonding/bond1

Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 80
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
        Aggregator ID: 2
        Number of ports: 2
        Actor Key: 17
        Partner Key: 5
        Partner Mac Address: 00:19:e7:b2:07:80

Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:1b:24:bd:e9:cd
Aggregator ID: 2

Slave Interface: eth3
MII Status: up
Link Failure Count: 2
Permanent HW addr: 00:1b:24:bd:e9:cf
Aggregator ID: 2
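
For context on the "layer3+4" transmit hash policy shown above: as far as I understand the bonding documentation for kernels of this era, the slave for an unfragmented TCP/UDP packet is chosen as ((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xffff)) modulo slave count. The sketch below only illustrates that formula under this assumption; it is not the driver code, and fragments and other protocols are handled differently. The function name and the example addresses are hypothetical.

```python
import ipaddress


def l34_hash(src_ip, dst_ip, src_port, dst_port, slave_count):
    """Slave index for an unfragmented IPv4 TCP/UDP flow, following the
    documented layer3+4 formula (an assumption, see the text above)."""
    ip_xor = int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))
    return ((src_port ^ dst_port) ^ (ip_xor & 0xFFFF)) % slave_count


if __name__ == "__main__":
    # Two slaves per bond, as in bond0 (eth0/eth2) and bond1 (eth1/eth3).
    print(l34_hash("10.0.0.1", "10.0.0.2", 1000, 80, 2))
```

Every packet of a given flow maps to the same slave, so a single flow never uses more than one link of the aggregate.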


> mpstat -P ALL 10

08:04:36 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
08:04:46 PM  all    0.00    0.00    0.01    0.00    0.00    1.05    0.00   98.94  70525.73
08:04:46 PM    0    0.00    0.00    0.00    0.00    0.00    0.70    0.00   99.30   7814.41
08:04:46 PM    1    0.00    0.00    0.00    0.00    0.00    2.10    0.00   97.90   7814.41
08:04:46 PM    2    0.00    0.00    0.00    0.00    0.00    0.20    0.00   99.80   7814.41
08:04:46 PM    3    0.00    0.00    0.10    0.00    0.00    1.30    0.00   98.60   7814.51
08:04:46 PM    4    0.00    0.00    0.00    0.00    0.00    0.50    0.00   99.50   7814.41
08:04:46 PM    5    0.00    0.00    0.00    0.00    0.00    1.90    0.00   98.10   7814.41
08:04:46 PM    6    0.00    0.00    0.00    0.00    0.00    0.60    0.00   99.40   7814.41
08:04:46 PM    7    0.00    0.00    0.10    0.00    0.00    0.90    0.00   99.00   7814.51
08:04:46 PM    8    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00

08:04:46 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
08:04:56 PM  all    0.00    0.00    0.01    0.00    0.00    1.49    0.00   98.50  66429.30
08:04:56 PM    0    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00   7303.50
08:04:56 PM    1    0.00    0.00    0.00    0.00    0.00    1.60    0.00   98.40   7303.50
08:04:56 PM    2    0.00    0.00    0.00    0.00    0.00    1.20    0.00   98.80   7303.50
08:04:56 PM    3    0.00    0.00    0.00    0.00    0.00    3.20    0.00   96.80   7303.40
08:04:56 PM    4    0.00    0.00    0.00    0.00    0.00    1.90    0.00   98.10   7303.60
08:04:56 PM    5    0.00    0.00    0.00    0.00    0.00    1.20    0.00   98.80   7303.50
08:04:56 PM    6    0.00    0.00    0.10    0.00    0.00    1.80    0.00   98.10   7303.50
08:04:56 PM    7    0.00    0.00    0.00    0.00    0.00    1.20    0.00   98.80   7303.50
08:04:56 PM    8    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00      0.00

> ifconfig -a

bond0     Link encap:Ethernet  HWaddr 00:1B:24:BD:E9:CC
          inet addr:xxx.xxx.135.44  Bcast:xxx.xxx.135.47  Mask:255.255.255.248
          inet6 addr: fe80::21b:24ff:febd:e9cc/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:436076190 errors:0 dropped:391250 overruns:0 frame:0
          TX packets:2620156321 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:4210046233 (3.9 GiB)  TX bytes:2520272242 (2.3 GiB)

bond1     Link encap:Ethernet  HWaddr 00:1B:24:BD:E9:CD
          inet addr:xxx.xxx.70.156  Bcast:xxx.xxx.70.159  Mask:255.255.255.248
          inet6 addr: fe80::21b:24ff:febd:e9cd/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:239471641 errors:0 dropped:344 overruns:0 frame:0
          TX packets:3704083902 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2488754745 (2.3 GiB)  TX bytes:2685275089 (2.5 GiB)

eth0      Link encap:Ethernet  HWaddr 00:1B:24:BD:E9:CC
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:2235085582 errors:0 dropped:353786 overruns:0 frame:0
          TX packets:1266449269 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3768096439 (3.5 GiB)  TX bytes:113363829 (108.1 MiB)
          Memory:fc6e0000-fc700000

eth1      Link encap:Ethernet  HWaddr 00:1B:24:BD:E9:CD
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:4228974804 errors:0 dropped:344 overruns:0 frame:0
          TX packets:1750216649 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3350270261 (3.1 GiB)  TX bytes:3358220645 (3.1 GiB)
          Memory:fc6c0000-fc6e0000

eth2      Link encap:Ethernet  HWaddr 00:1B:24:BD:E9:CC
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:2495958020 errors:0 dropped:37464 overruns:0 frame:0
          TX packets:1353707165 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:442055526 (421.5 MiB)  TX bytes:2406943933 (2.2 GiB)
          Memory:fcde0000-fce00000

eth3      Link encap:Ethernet  HWaddr 00:1B:24:BD:E9:CD
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:305464222 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1953867360 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3433479245 (3.1 GiB)  TX bytes:3622113909 (3.3 GiB)
          Memory:fcd80000-fcda0000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:53537 errors:0 dropped:0 overruns:0 frame:0
          TX packets:53537 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:431006433 (411.0 MiB)  TX bytes:431006433 (411.0 MiB)


NOTE: The drop counters that ifconfig shows on bond0/bond1 are *NOT* increasing. Those drops were already there before this test.

-- 
Best Regards
Vladimir Ivashchenko
Chief Technology Officer
PrimeTel, Cyprus - www.prime-tel.com