Message-ID: <CAL8zT=gNtcdyyVcPt5hB6jyF1btzQArEuZgVKWkb0Wd=a4LcVA@mail.gmail.com>
Date:	Tue, 12 Jun 2012 10:24:03 +0200
From:	Jean-Michel Hautbois <jhautbois@...il.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Sathya.Perla@...lex.com, netdev@...r.kernel.org
Subject: Re: Difficulties to get 1Gbps on be2net ethernet card

2012/6/8 Jean-Michel Hautbois <jhautbois@...il.com>:
> 2012/6/8 Eric Dumazet <eric.dumazet@...il.com>:
>> On Fri, 2012-06-08 at 10:14 +0200, Jean-Michel Hautbois wrote:
>>> 2012/6/8 Eric Dumazet <eric.dumazet@...il.com>:
>>> > On Thu, 2012-06-07 at 14:54 +0200, Jean-Michel Hautbois wrote:
>>> >
>>> >> eth1      Link encap:Ethernet  HWaddr 68:b5:99:b9:8d:d4
>>> >>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:4096  Metric:1
>>> >>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>> >>           TX packets:15215387 errors:0 dropped:0 overruns:0 carrier:0
>>> >>           collisions:0 txqueuelen:1000
>>> >>           RX bytes:0 (0.0 B)  TX bytes:61476524359 (57.2 GiB)
>>> >
>>> >> qdisc mq 0: dev eth1 root
>>> >>  Sent 61476524359 bytes 15215387 pkt (dropped 45683472, overlimits 0 requeues 17480)
>>> >
>>> > OK, and "tc -s -d cl show dev eth1"
>>> >
>>> > (How many queues are really used)
>>> >
>>> >
>>> >
>>>
>>> tc -s -d cl show dev eth1
>>> class mq :1 root
>>>  Sent 9798071746 bytes 2425410 pkt (dropped 3442405, overlimits 0 requeues 2747)
>>>  backlog 0b 0p requeues 2747
>>> class mq :2 root
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> class mq :3 root
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> class mq :4 root
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> class mq :5 root
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> class mq :6 root
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> class mq :7 root
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>> class mq :8 root
>>>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>>>  backlog 0b 0p requeues 0
>>
>>
>> Do you have the same distribution on old kernels as well ?
>> (ie only queue 0 is used)
>>
>>
>>
>
> On the old kernel, there is nothing returned by this command.
>
> JM
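
A minimal sketch for summarising the per-class output quoted above: it
parses text shaped like the "tc -s -d cl show dev eth1" listing and
reports which mq classes actually carried traffic. The helper names
(parse_tc_classes, active_classes, drop_fraction) are hypothetical, and
the drop fraction dropped / (sent + dropped) assumes that "Sent" counts
only packets that were actually dequeued.

```python
import re

# "class mq :1 root" -> qdisc kind and class id
CLASS = re.compile(r"^class\s+(\S+)\s+(\S+)")
# "Sent 9798071746 bytes 2425410 pkt (dropped 3442405, ..."
SENT = re.compile(r"Sent\s+(\d+)\s+bytes\s+(\d+)\s+pkt\s+\(dropped\s+(\d+),")

# Two classes copied from the listing quoted in this thread.
SAMPLE = """\
class mq :1 root
 Sent 9798071746 bytes 2425410 pkt (dropped 3442405, overlimits 0 requeues 2747)
 backlog 0b 0p requeues 2747
class mq :2 root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
"""


def parse_tc_classes(text):
    """Return one dict per class with its byte, packet and drop counters."""
    classes = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = CLASS.match(line)
        if m:
            current = {"qdisc": m.group(1), "classid": m.group(2),
                       "bytes": 0, "pkts": 0, "dropped": 0}
            classes.append(current)
            continue
        m = SENT.search(line)
        if m and current is not None:
            current["bytes"], current["pkts"], current["dropped"] = map(int, m.groups())
    return classes


def active_classes(classes):
    """Classes that sent or dropped at least one packet."""
    return [c for c in classes if c["pkts"] or c["dropped"]]


def drop_fraction(cls):
    """Assumed definition: dropped / (sent + dropped); 0.0 for an idle class."""
    total = cls["pkts"] + cls["dropped"]
    return cls["dropped"] / total if total else 0.0


if __name__ == "__main__":
    for c in active_classes(parse_tc_classes(SAMPLE)):
        print(c["classid"], c["pkts"], c["dropped"], round(drop_fraction(c), 3))
```

Feeding it the full eight-class listing would make it easy to see at a
glance that only class :1 is in use.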

I used perf to get more information.
Here is the result of perf record -a sleep 10 with bonding (I kept only
the kernel side):
     6.93%    ModuleTester  [kernel.kallsyms]                 [k] copy_user_generic_string
     2.99%         swapper  [kernel.kallsyms]                 [k] mwait_idle
     2.60%          kipmi0  [ipmi_si]                         [k] port_inb
     1.75%         swapper  [kernel.kallsyms]                 [k] rb_prev
     1.63%    ModuleTester  [kernel.kallsyms]                 [k] _raw_spin_lock
     1.43%     NodeManager  [kernel.kallsyms]                 [k] delay_tsc
     0.90%    ModuleTester  [kernel.kallsyms]                 [k] clear_page_c
     0.88%    ModuleTester  [kernel.kallsyms]                 [k] dev_queue_xmit
     0.80%              ip  [kernel.kallsyms]                 [k] snmp_fold_field
     0.73%    ModuleTester  [kernel.kallsyms]                 [k] clflush_cache_range
     0.69%            grep  [kernel.kallsyms]                 [k] page_fault
     0.61%              sh  [kernel.kallsyms]                 [k] page_fault
     0.59%    ModuleTester  [kernel.kallsyms]                 [k] udp_sendmsg
     0.55%    ModuleTester  [kernel.kallsyms]                 [k] _raw_read_lock
     0.53%              sh  [kernel.kallsyms]                 [k] unmap_vmas
     0.52%    ModuleTester  [kernel.kallsyms]                 [k] rb_prev
     0.51%    ModuleTester  [kernel.kallsyms]                 [k] find_busiest_group
     0.49%    ModuleTester  [kernel.kallsyms]                 [k] __ip_make_skb
     0.48%    ModuleTester  [kernel.kallsyms]                 [k] sock_alloc_send_pskb
     0.48%    ModuleTester  libpthread-2.7.so                 [.] pthread_mutex_lock
     0.47%    ModuleTester  [kernel.kallsyms]                 [k] __netif_receive_skb
     0.44%              ip  [kernel.kallsyms]                 [k] find_next_bit
     0.43%         swapper  [kernel.kallsyms]                 [k] clflush_cache_range
     0.41%              ps  [kernel.kallsyms]                 [k] format_decode
     0.41%    ModuleTester  [bonding]                         [k] bond_start_xmit
     0.39%    ModuleTester  [be2net]                          [k] be_xmit
     0.39%    ModuleTester  [kernel.kallsyms]                 [k] __ip_append_data
     0.38%    ModuleTester  [kernel.kallsyms]                 [k] netif_rx
     0.37%         swapper  [be2net]                          [k] be_poll
     0.37%         swapper  [kernel.kallsyms]                 [k] ktime_get
     0.37%              sh  [kernel.kallsyms]                 [k] copy_page_c
     0.36%         swapper  [kernel.kallsyms]                 [k] irq_entries_start
     0.36%    ModuleTester  [kernel.kallsyms]                 [k] __alloc_pages_nodemask
     0.35%    ModuleTester  [kernel.kallsyms]                 [k] __slab_free
     0.35%    ModuleTester  [kernel.kallsyms]                 [k] ip_mc_output
     0.34%    ModuleTester  [kernel.kallsyms]                 [k] skb_release_data
     0.33%              ip  [kernel.kallsyms]                 [k] page_fault
     0.33%    ModuleTester  [kernel.kallsyms]                 [k] udp_send_skb

And here is the perf record -a result without bonding:
     2.49%     ModuleTester  [kernel.kallsyms]               [k] csum_partial_copy_generic
     1.35%     ModuleTester  [kernel.kallsyms]               [k] _raw_spin_lock
     1.29%     ModuleTester  [kernel.kallsyms]               [k] clflush_cache_range
     1.16%       jobprocess  [kernel.kallsyms]               [k] rb_prev
     1.01%       jobprocess  [kernel.kallsyms]               [k] clflush_cache_range
     0.81%     ModuleTester  [be2net]                        [k] be_xmit
     0.78%       jobprocess  [kernel.kallsyms]               [k] __slab_free
     0.77%          swapper  [kernel.kallsyms]               [k] mwait_idle
     0.72%     ModuleTester  [kernel.kallsyms]               [k] __domain_mapping
     0.66%       jobprocess  [kernel.kallsyms]               [k] _raw_spin_lock
     0.59%       jobprocess  [kernel.kallsyms]               [k] _raw_spin_lock_irqsave
     0.56%     ModuleTester  [kernel.kallsyms]               [k] rb_prev
     0.53%          swapper  [kernel.kallsyms]               [k] rb_prev
     0.49%     ModuleTester  [kernel.kallsyms]               [k] sock_wmalloc
     0.47%       jobprocess  [be2net]                        [k] be_poll
     0.47%     ModuleTester  [kernel.kallsyms]               [k] kmem_cache_alloc
     0.47%          swapper  [kernel.kallsyms]               [k] clflush_cache_range
     0.45%           kipmi0  [ipmi_si]                       [k] port_inb
     0.42%          swapper  [kernel.kallsyms]               [k] __slab_free
     0.41%       jobprocess  [kernel.kallsyms]               [k] try_to_wake_up
     0.40%     ModuleTester  [kernel.kallsyms]               [k] kmem_cache_alloc_node
     0.40%       jobprocess  [kernel.kallsyms]               [k] tg_load_down
     0.39%       jobprocess  libodyssey.so.1.8.2             [.] y8_deblocking_luma_vert_edge_h264_sse2
     0.38%       jobprocess  libodyssey.so.1.8.2             [.] y8_deblocking_luma_horz_edge_h264_ssse3
     0.38%     ModuleTester  [kernel.kallsyms]               [k] rb_insert_color
     0.37%       jobprocess  [kernel.kallsyms]               [k] find_iova
     0.37%       jobprocess  [kernel.kallsyms]               [k] find_busiest_group
     0.36%       jobprocess  libpthread-2.7.so               [.] pthread_mutex_lock
     0.35%          swapper  [kernel.kallsyms]               [k] _raw_spin_unlock_irqrestore
     0.34%     ModuleTester  [kernel.kallsyms]               [k] _raw_spin_lock_irqsave
     0.33%     ModuleTester  [kernel.kallsyms]               [k] pfifo_fast_dequeue
     0.32%     ModuleTester  [kernel.kallsyms]               [k] __kmalloc_node_track_caller
     0.32%       jobprocess  [be2net]                        [k] be_tx_compl_process
     0.31%     ModuleTester  [kernel.kallsyms]               [k] ip_fragment
     0.29%          swapper  [kernel.kallsyms]               [k] __hrtimer_start_range_ns
     0.29%       jobprocess  [kernel.kallsyms]               [k] __schedule
     0.29%     ModuleTester  [kernel.kallsyms]               [k] dev_queue_xmit
     0.28%          swapper  [kernel.kallsyms]               [k] __schedule

The first thing I notice is the difference in copy_user_generic_string:
it is 6.93% with bonding but only 0.11% in the second measurement (I
did not include that line above).
I think perf can help find the issue I observe with bonding. Do you
have any suggestions on which parameters to use?
FYI, with bonding, TX tops out at 640Mbps; without bonding, I can send
2.4Gbps without any trouble.
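
To make the comparison between the two profiles less manual, here is a
minimal sketch, assuming both listings are saved as text in the format
shown above (overhead, command, shared object, [k]/[.], symbol). It sums
the overhead per symbol across commands and ranks symbols by how much
they differ between the two runs. The names parse_perf and compare are
hypothetical, the sample rows are a small excerpt of the listings in
this message, and symbols containing spaces are not handled.

```python
import re
from collections import defaultdict

# One report row: overhead %, command, shared object, [k] or [.], symbol.
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+\[([k.])\]\s+(\S+)\s*$")

# Excerpts from the two listings, including rows wrapped by the mail client.
WITH_BONDING = """
     6.93%    ModuleTester  [kernel.kallsyms]                 [k]
copy_user_generic_string
     1.75%         swapper  [kernel.kallsyms]                 [k] rb_prev
     1.63%    ModuleTester  [kernel.kallsyms]                 [k] _raw_spin_lock
     0.52%    ModuleTester  [kernel.kallsyms]                 [k] rb_prev
"""

WITHOUT_BONDING = """
     2.49%     ModuleTester  [kernel.kallsyms]               [k]
csum_partial_copy_generic
     1.35%     ModuleTester  [kernel.kallsyms]               [k] _raw_spin_lock
     1.16%       jobprocess  [kernel.kallsyms]               [k] rb_prev
     0.66%       jobprocess  [kernel.kallsyms]               [k] _raw_spin_lock
"""


def parse_perf(text):
    """Return {symbol: overhead in percent, summed over all commands}."""
    totals = defaultdict(float)
    pending = ""  # a row whose symbol may have wrapped onto the next line
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        m = ROW.match(line)
        if m is None and pending:
            m = ROW.match(pending + " " + line.strip())
        if m:
            totals[m.group(5)] += float(m.group(1))
            pending = ""
        else:
            pending = line.rstrip()
    return dict(totals)


def compare(first, second, top=5):
    """Rows of (symbol, first %, second %), largest absolute change first."""
    symbols = set(first) | set(second)
    rows = [(s, first.get(s, 0.0), second.get(s, 0.0)) for s in symbols]
    rows.sort(key=lambda r: abs(r[1] - r[2]), reverse=True)
    return rows[:top]


if __name__ == "__main__":
    bonded = parse_perf(WITH_BONDING)
    plain = parse_perf(WITHOUT_BONDING)
    for symbol, a, b in compare(bonded, plain):
        print(f"{symbol:32s} {a:6.2f}% {b:6.2f}% {a - b:+6.2f}")
```

On the excerpt, copy_user_generic_string comes out first, which matches
what I see by eye.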

JM
