Message-ID: <YzFM8RF0suHc4cKI@unreal>
Date: Mon, 26 Sep 2022 09:55:45 +0300
From: Leon Romanovsky <leon@...nel.org>
To: Steffen Klassert <steffen.klassert@...unet.com>
Cc: Jakub Kicinski <kuba@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Herbert Xu <herbert@...dor.apana.org.au>,
netdev@...r.kernel.org, Paolo Abeni <pabeni@...hat.com>,
Raed Salem <raeds@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>,
Bharat Bhushan <bbhushan2@...vell.com>
Subject: Re: [PATCH RFC xfrm-next v3 0/8] Extend XFRM core to allow full
offload configuration
On Sun, Sep 25, 2022 at 11:40:39AM +0200, Steffen Klassert wrote:
> On Wed, Sep 21, 2022 at 08:37:06PM +0300, Leon Romanovsky wrote:
> > On Wed, Sep 21, 2022 at 07:59:27AM -0700, Jakub Kicinski wrote:
> > > On Thu, 8 Sep 2022 12:56:16 +0300 Leon Romanovsky wrote:
> > > > I have TX traces too and can add if RX are not sufficient.
> > >
> > > The perf trace is good, but for those of us not intimately familiar
> > > with xfrm, could you provide some analysis here?
> >
> > The perf trace presented is for RX path of IPsec crypto offload mode. In that
> > mode, decrypted packet enters the netdev stack to perform various XFRM specific
> > checks.
>
> Can you provide the perf traces and analysis for the TX side too? That
> would be interesting in particular, because the policy and state lookups
> there happen still in software.
Single core TX (crypto mode) from the same run:
Please note that TX is not really the bottleneck here; RX probably caused a
situation where TX was not executed often enough. TX is also a lighter path than RX.
# Children Self Samples Command Shared Object Symbol
# ........ ........ ............ ............... .................. ..............................................
#
86.58% 0.00% 0 swapper [kernel.vmlinux] [k] secondary_startup_64_no_verify
|
---secondary_startup_64_no_verify
start_secondary
cpu_startup_entry
do_idle
|
--86.37%--cpu_idle_poll
|
--24.53%--asm_common_interrupt
|
--24.48%--common_interrupt
|
|--23.47%--irq_exit_rcu
| |
| --23.23%--do_softirq_own_stack
| |
| --23.17%--asm_call_irq_on_stack
| __do_softirq
| |
| |--22.23%--net_rx_action
| | |
| | |--20.17%--gro_cell_poll
| | | |
| | | --20.02%--napi_complete_done
| | | |
| | | --19.98%--gro_normal_list.part.154
| | | |
| | | --19.96%--netif_receive_skb_list_internal
| | | |
| | | --19.89%--__netif_receive_skb_list_core
| | | |
| | | --19.77%--ip_list_rcv
| | | |
| | | --19.67%--ip_sublist_rcv
| | | |
| | | --19.56%--ip_sublist_rcv_finish
| | | |
| | | --19.54%--ip_local_deliver
| | | |
| | | --19.49%--ip_local_deliver_finish
| | | |
| | | --19.47%--ip_protocol_deliver_rcu
| | | |
| | | --19.43%--tcp_v4_rcv
| | | |
| | | --18.87%--tcp_v4_do_rcv
| | | |
| | | --18.83%--tcp_rcv_established
| | | |
| | | |--16.41%--__tcp_push_pending_frames
| | | | |
| | | | --16.38%--tcp_write_xmit
| | | | |
| | | | |--6.35%--tcp_event_new_data_sent
| | | | | |
| | | | | --6.22%--sk_reset_timer
| | | | | |
| | | | | --6.21%--mod_timer
| | | | | |
| | | | | --6.10%--get_nohz_timer_target
| | | | | |
| | | | | --1.87%--cpumask_next_and
| | | | | |
| | | | | --1.07%--_find_next_bit.constprop.1
| | | | |
| | | | |--5.50%--tcp_schedule_loss_probe
| | | | | |
| | | | | --5.49%--sk_reset_timer
| | | | | mod_timer
| | | | | |
| | | | | --5.43%--get_nohz_timer_target
| | | | | |
| | | | | --1.37%--cpumask_next_and
| | | | | |
| | | | | --0.71%--_find_next_bit.constprop.1
| | | | |
| | | | --4.31%--__tcp_transmit_skb
| | | | |
| | | | --3.87%--__ip_queue_xmit
| | | | |
| | | | --3.54%--xfrm4_output
| | | | |
| | | | --3.26%--xfrm_output_resume
| | | | |
| | | | --2.88%--ip_output
| | | | |
| | | | --2.78%--ip_finish_output2
| | | | |
| | | | --2.73%--__dev_queue_xmit
| | | | |
| | | | --2.49%--sch_direct_xmit
| | | | |
| | | | |--1.50%--validate_xmit_skb_list
| | | | | | |
| | | | | --1.32%--validate_xmit_skb
| | | | | | |
| | | | | --1.06%--__skb_gso_segment
| | | | | | |
| | | | | --1.04%--skb_mac_gso_segment
| | | | | |
| | | | --1.02%--inet_gso_segment
| | | | | |
| | | | --0.93%--esp4_gso_segment
| | | | | |
| | | | --0.86%--tcp_gso_segment
| | | | |
| | | --0.78%--skb_segment
| | | | |
| | | | --0.77%--dev_hard_start_xmit
| | | | | |
| | | | --0.75%--mlx5e_xmit
| | | |
| | | --1.87%--tcp_ack
| | | |
| | | --1.66%--tcp_clean_rtx_queue
| | | |
| | | --1.35%--__kfree_skb
| | | |
| | | --1.21%--skb_release_data
| | |
| | --1.92%--mlx5e_napi_poll
| | |
| | --1.38%--mlx5e_poll_rx_cq
| | |
| | --1.33%--mlx5e_handle_rx_cqe
| | |
| | --0.53%--napi_gro_receive
| | |
| | --0.52%--dev_gro_receive
| |
| --0.77%--tasklet_action_common.isra.17
|
--0.80%--asm_call_irq_on_stack
|
--0.78%--handle_edge_irq
|
--0.74%--handle_irq_event
|
--0.71%--handle_irq_event_percpu
|
--0.64%--__handle_irq_event_percpu
|
--0.60%--mlx5_irq_int_handler
|
--0.58%--atomic_notifier_call_chain
|
--0.57%--mlx5_eq_comp_int
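
For readers who want to compare branches of the trace above, here is a minimal
sketch that pulls the "children" percentages out of a few call-graph lines and
relates them to each other. The TRACE excerpt is copied from the output above;
the helper names (parse_children, share) are hypothetical and exist only for
this illustration. It assumes every symbol in the excerpt appears once.

```python
import re

# A few call-graph lines copied from the trace above ("children" percentages).
TRACE = """\
--16.38%--tcp_write_xmit
|--6.35%--tcp_event_new_data_sent
|--5.50%--tcp_schedule_loss_probe
--4.31%--__tcp_transmit_skb
--3.54%--xfrm4_output
"""

# Matches entries of the form "--<percent>%--<symbol>".
ENTRY = re.compile(r"--(\d+\.\d+)%--(\S+)")


def parse_children(text):
    """Map each symbol to its percentage; assumes symbols are unique."""
    return {sym: float(pct) for pct, sym in ENTRY.findall(text)}


def share(costs, part, whole):
    """Return the cost of `part` as a percentage of the cost of `whole`."""
    return 100.0 * costs[part] / costs[whole]


costs = parse_children(TRACE)

# Both of these branches end in sk_reset_timer -> mod_timer in the trace.
timers = costs["tcp_event_new_data_sent"] + costs["tcp_schedule_loss_probe"]

print(f"timer branches: {timers:.2f}% of all samples")
print(f"__tcp_transmit_skb: {costs['__tcp_transmit_skb']:.2f}% of all samples")
print(
    "xfrm4_output share of tcp_write_xmit: "
    f"{share(costs, 'xfrm4_output', 'tcp_write_xmit'):.1f}%"
)
```

This only restates numbers already present in the trace; it does not measure
anything new.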