Date:	Tue, 5 Aug 2008 11:06:53 +0300
From:	Denys Fedoryshchenko <denys@...p.net.lb>
To:	netdev@...r.kernel.org
Subject: Re: thousands of classes, e1000 TX unit hang

A little bit more info:

The oprofile output below is from another machine (it doesn't suffer as much,
but I also notice drops on eth0 there after adding around 100 interfaces). On
the first machine the clocksource is TSC; on the machine where I collected
these stats it is acpi_pm.

CPU: P4 / Xeon with 2 hyper-threads, speed 3200.53 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not 
stopped) with a unit mask of 0x01 (mandatory) count 100000
GLOBAL_POWER_E...|
  samples|      %|
------------------
   973464 75.7644 vmlinux
    97703  7.6042 libc-2.6.1.so
    36166  2.8148 cls_fw
    18290  1.4235 nf_conntrack
    17946  1.3967 busybox
        GLOBAL_POWER_E...|

CPU: P4 / Xeon with 2 hyper-threads, speed 3200.53 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not 
stopped) with a unit mask of 0x01 (mandatory) count 100000
samples  %        symbol name
245545   23.1963  acpi_pm_read
143863   13.5905  __copy_to_user_ll
121269   11.4561  ioread16
58609     5.5367  gen_kill_estimator
40153     3.7932  ioread32
33923     3.2047  ioread8
16491     1.5579  arch_task_cache_init
16067     1.5178  sysenter_past_esp
11604     1.0962  find_get_page
10631     1.0043  est_timer
9038      0.8538  get_page_from_freelist
8681      0.8201  sk_run_filter
8077      0.7630  irq_entries_start
7711      0.7284  schedule
6451      0.6094  copy_to_user




On Tuesday 05 August 2008, Denys Fedoryshchenko wrote:
> I wrote a script that looks something like this (to simulate SFQ with the
> flow classifier):
>
> ($2 is the ppp interface)
> echo "qdisc del dev $2 root ">>${TEMP}
> echo "qdisc add dev $2 root handle 1: htb ">>${TEMP}
>  echo "filter add dev $2 protocol ip pref 16 parent 1: u32 \
> 	match ip dst 0.0.0.0/0 police rate 8kbit burst 2048kb \
> 	peakrate 1024Kbit mtu 10000 \
> 	conform-exceed continue/ok">>${TEMP}
>
> echo "filter add dev $2 protocol ip pref 32 parent 1: handle 1 \
> 	flow hash keys nfct divisor 128 baseclass 1:2">>${TEMP}
>
> echo "class add dev $2 parent 1: classid 1:1 htb \
> 	rate ${rate}bit ceil ${rate}Kbit quantum 1514">>${TEMP}
>
> #Cycle to add 128 classes
> maxslot=130
> for slot in `seq 2 $maxslot`; do
> echo "class add dev $2 parent 1:1 classid 1:$slot htb \
> 	rate 8Kbit ceil 256Kbit quantum 1514">>${TEMP}
> echo "qdisc add dev $2 handle $slot: parent 1:$slot bfifo limit 3000">>${TEMP}
> done
>
> After adding around 400-450 ppp interfaces the server starts to "crack".
> There is definitely packet loss on eth0 (even though there are no filters
> or shapers on it). Even deleting all the classes becomes a challenge. After
> deleting all root handles on the ppp interfaces, everything is OK again.
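
To put the 400-450 interface figure in perspective, here is a rough count of the tc objects the quoted script would create. This is only a sketch, assuming every ppp interface gets the full script with maxslot=130 (slots 2..130); the function name tc_objects is hypothetical:

```shell
#!/bin/sh
# Count the tc objects created by the quoted script.
# Per interface: 1 root htb qdisc, 1 parent class (1:1), 2 filters,
# plus one leaf class and one bfifo qdisc per slot in 2..maxslot.
tc_objects() {
    ifaces=$1
    maxslot=${2:-130}
    leaves=$((maxslot - 1))             # slots 2..maxslot inclusive
    classes=$((ifaces * (leaves + 1)))  # leaf classes + class 1:1
    qdiscs=$((ifaces * (leaves + 1)))   # bfifo qdiscs + root htb
    filters=$((ifaces * 2))             # u32 police + flow hash
    echo "classes=$classes qdiscs=$qdiscs filters=$filters"
}
```

Under those assumptions, tc_objects 450 gives 58500 classes and 58500 qdiscs, so the "thousands of classes" in the subject is, if anything, an understatement.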
>
>
> Traffic through the host is 15-20 Mbit/s at that moment. It is a single-CPU
> Xeon 3.0 GHz on an SE7520 server motherboard with 1 GB RAM (at the time of
> testing more than 512 MB was free).
>
> Kernel is 2.6.26.1-vanilla
> Is there anything else I should add?
>
> Error message appearing in dmesg:
> [149650.006939] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149650.006943]   Tx Queue             <0>
> [149650.006944]   TDH                  <a3>
> [149650.006945]   TDT                  <a3>
> [149650.006947]   next_to_use          <a3>
> [149650.006948]   next_to_clean        <f8>
> [149650.006949] buffer_info[next_to_clean]
> [149650.006951]   time_stamp           <8e69a7c>
> [149650.006952]   next_to_watch        <f8>
> [149650.006953]   jiffies              <8e6a111>
> [149650.006954]   next_to_watch.status <1>
> [149655.964100] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149655.964104]   Tx Queue             <0>
> [149655.964105]   TDH                  <6c>
> [149655.964107]   TDT                  <6c>
> [149655.964108]   next_to_use          <6c>
> [149655.964109]   next_to_clean        <c1>
> [149655.964111] buffer_info[next_to_clean]
> [149655.964112]   time_stamp           <8e6b198>
> [149655.964113]   next_to_watch        <c1>
> [149655.964115]   jiffies              <8e6b853>
> [149655.964116]   next_to_watch.status <1>
> [149666.765110] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149666.765110]   Tx Queue             <0>
> [149666.765110]   TDH                  <28>
> [149666.765110]   TDT                  <28>
> [149666.765110]   next_to_use          <28>
> [149666.765110]   next_to_clean        <7e>
> [149666.765110] buffer_info[next_to_clean]
> [149666.765110]   time_stamp           <8e6db6a>
> [149666.765110]   next_to_watch        <7e>
> [149666.765110]   jiffies              <8e6e27f>
> [149666.765110]   next_to_watch.status <1>
> [149668.629051] e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149668.629056]   Tx Queue             <0>
> [149668.629058]   TDH                  <1b>
> [149668.629060]   TDT                  <1b>
> [149668.629062]   next_to_use          <1b>
> [149668.629064]   next_to_clean        <f1>
> [149668.629066] buffer_info[next_to_clean]
> [149668.629068]   time_stamp           <8e6e4c3>
> [149668.629070]   next_to_watch        <f1>
> [149668.629072]   jiffies              <8e6e9c7>
> [149668.629074]   next_to_watch.status <1>
> [149676.606031] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149676.606035]   Tx Queue             <0>
> [149676.606037]   TDH                  <9b>
> [149676.606038]   TDT                  <9b>
> [149676.606039]   next_to_use          <9b>
> [149676.606040]   next_to_clean        <f0>
> [149676.606042] buffer_info[next_to_clean]
> [149676.606043]   time_stamp           <8e7024c>
> [149676.606044]   next_to_watch        <f0>
> [149676.606046]   jiffies              <8e708eb>
> [149676.606047]   next_to_watch.status <1>
> [149680.151750] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149680.151750]   Tx Queue             <0>
> [149680.151750]   TDH                  <84>
> [149680.151750]   TDT                  <84>
> [149680.151750]   next_to_use          <84>
> [149680.151750]   next_to_clean        <d9>
> [149680.151750] buffer_info[next_to_clean]
> [149680.151750]   time_stamp           <8e7100d>
> [149680.151750]   next_to_watch        <d9>
> [149680.151750]   jiffies              <8e716c3>
> [149680.151750]   next_to_watch.status <1>
> [149680.153751] e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149680.153751]   Tx Queue             <0>
> [149680.153751]   TDH                  <aa>
> [149680.153751]   TDT                  <d2>
> [149680.153751]   next_to_use          <d2>
> [149680.153751]   next_to_clean        <2d>
> [149680.153751] buffer_info[next_to_clean]
> [149680.153751]   time_stamp           <8e710db>
> [149680.153751]   next_to_watch        <2d>
> [149680.153751]   jiffies              <8e716c5>
> [149680.153751]   next_to_watch.status <1>
> [149702.565549] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149702.565549]   Tx Queue             <0>
> [149702.565549]   TDH                  <3c>
> [149702.565549]   TDT                  <3c>
> [149702.565549]   next_to_use          <3c>
> [149702.565549]   next_to_clean        <91>
> [149702.565549] buffer_info[next_to_clean]
> [149702.565549]   time_stamp           <8e7676e>
> [149702.565549]   next_to_watch        <91>
> [149702.565549]   jiffies              <8e76e48>
> [149702.565549]   next_to_watch.status <1>
> [149708.020581] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149708.020581]   Tx Queue             <0>
> [149708.020581]   TDH                  <4c>
> [149708.020581]   TDT                  <4c>
> [149708.020581]   next_to_use          <4c>
> [149708.020581]   next_to_clean        <a1>
> [149708.020581] buffer_info[next_to_clean]
> [149708.020581]   time_stamp           <8e77cc3>
> [149708.020581]   next_to_watch        <a1>
> [149708.020581]   jiffies              <8e78394>
> [149708.020581]   next_to_watch.status <1>
> [149713.864829] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149713.864833]   Tx Queue             <0>
> [149713.864835]   TDH                  <b0>
> [149713.864836]   TDT                  <b0>
> [149713.864837]   next_to_use          <b0>
> [149713.864839]   next_to_clean        <5>
> [149713.864840] buffer_info[next_to_clean]
> [149713.864841]   time_stamp           <8e7937b>
> [149713.864842]   next_to_watch        <5>
> [149713.864844]   jiffies              <8e79a64>
> [149713.864845]   next_to_watch.status <1>
> [149759.710721] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149759.710726]   Tx Queue             <0>
> [149759.710729]   TDH                  <88>
> [149759.710730]   TDT                  <88>
> [149759.710732]   next_to_use          <88>
> [149759.710734]   next_to_clean        <dd>
> [149759.710736] buffer_info[next_to_clean]
> [149759.710738]   time_stamp           <8e8465c>
> [149759.710740]   next_to_watch        <dd>
> [149759.710742]   jiffies              <8e84d6f>
> [149759.710744]   next_to_watch.status <1>
> [149759.712712] e1000: eth1: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149759.712715]   Tx Queue             <0>
> [149759.712717]   TDH                  <84>
> [149759.712719]   TDT                  <90>
> [149759.712721]   next_to_use          <90>
> [149759.712723]   next_to_clean        <e5>
> [149759.712725] buffer_info[next_to_clean]
> [149759.712726]   time_stamp           <8e84782>
> [149759.712728]   next_to_watch        <e5>
> [149759.712730]   jiffies              <8e84d71>
> [149759.712732]   next_to_watch.status <1>
> [149768.334753] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149768.334757]   Tx Queue             <0>
> [149768.334758]   TDH                  <92>
> [149768.334760]   TDT                  <92>
> [149768.334761]   next_to_use          <92>
> [149768.334762]   next_to_clean        <e7>
> [149768.334764] buffer_info[next_to_clean]
> [149768.334765]   time_stamp           <8e86829>
> [149768.334766]   next_to_watch        <e7>
> [149768.334767]   jiffies              <8e86f1c>
> [149768.334769]   next_to_watch.status <1>
> [149776.537825] e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> [149776.537825]   Tx Queue             <0>
> [149776.537825]   TDH                  <4e>
> [149776.537825]   TDT                  <4e>
> [149776.537825]   next_to_use          <4e>
> [149776.537825]   next_to_clean        <a3>
> [149776.537825] buffer_info[next_to_clean]
> [149776.537825]   time_stamp           <8e8882b>
> [149776.537825]   next_to_watch        <a3>
> [149776.537825]   jiffies              <8e88f21>
> [149776.537825]   next_to_watch.status <1>
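
One way to read these dumps is to subtract time_stamp from jiffies, which gives how long the descriptor at next_to_clean had been pending when the hang was reported. A minimal sketch, with a hypothetical helper name; both arguments are the hex values from the dump without a 0x prefix, and converting the result to seconds would need the kernel's HZ value, which is not stated here:

```shell
#!/bin/sh
# Print jiffies minus time_stamp for one Tx Unit Hang dump.
# $1 = time_stamp field, $2 = jiffies field (hex, no 0x prefix).
pending_jiffies() {
    ts=$1
    now=$2
    echo $(( 0x$now - 0x$ts ))
}
```

For the first dump above, pending_jiffies 8e69a7c 8e6a111 prints 1685.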
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

