Message-ID: <20240314025720.GA13853@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net>
Date: Wed, 13 Mar 2024 19:57:20 -0700
From: Shradha Gupta <shradhagupta@...ux.microsoft.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Haiyang Zhang <haiyangz@...rosoft.com>,
Shradha Gupta <shradhagupta@...rosoft.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Ajay Sharma <sharmaajay@...rosoft.com>,
Leon Romanovsky <leon@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
KY Srinivasan <kys@...rosoft.com>, Wei Liu <wei.liu@...nel.org>,
Dexuan Cui <decui@...rosoft.com>, Long Li <longli@...rosoft.com>,
Michael Kelley <mikelley@...rosoft.com>
Subject: Re: [PATCH] net: mana: Add per-cpu stats for MANA device
On Sun, Mar 10, 2024 at 09:19:50PM -0700, Shradha Gupta wrote:
> On Fri, Mar 08, 2024 at 11:22:44AM -0800, Jakub Kicinski wrote:
> > On Fri, 8 Mar 2024 18:51:58 +0000 Haiyang Zhang wrote:
> > > > Dynamic is a bit of an exaggeration, right? On a well-configured system
> > > > each CPU should use a single queue assigned thru XPS. And for manual
> > > > debug bpftrace should serve the purpose quite well.
> > >
> > > Some programs, like irqbalancer can dynamically change the CPU affinity,
> > > so we want to add the per-CPU counters for better understanding of the CPU
> > > usage.
> >
> > Do you have experimental data showing this making a difference
> > in production?
> Sure, will try to get that data for this discussion
> >
> > Seems unlikely, but if it does work we should enable it for all
> > devices, not driver by driver.
> You mean, if the usecase seems valid we should try to extend the framework
> mentioned by Rahul (https://lore.kernel.org/lkml/20240307072923.6cc8a2ba@kernel.org/)
> to include these stats as well?
> Will explore this a bit more and update. Thanks.
Following is the data we can share:
Default interrupt affinity for each queue:
25: 1 103 0 2989138 Hyper-V PCIe MSI 4138200989697-edge mana_q0@pci:7870:00:00.0
26: 0 1 4005360 0 Hyper-V PCIe MSI 4138200989698-edge mana_q1@pci:7870:00:00.0
27: 0 0 1 2997584 Hyper-V PCIe MSI 4138200989699-edge mana_q2@pci:7870:00:00.0
28: 3565461 0 0 1 Hyper-V PCIe MSI 4138200989700-edge mana_q3@pci:7870:00:00.0
As seen, the CPU-to-queue mapping is not 1:1: queue 0 and queue 2 are both mapped
to CPU3. With this knowledge we can work out the total RX stats processed by each
CPU, e.g. for CPU3 by adding up the mana_q0 and mana_q2 per-queue stats. But if
the mapping changes dynamically through irqbalance or edits to the smp_affinity
files, that assumption breaks.
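For reference, the kind of userspace approximation described above could look
like the minimal sketch below. It assumes the mana_q<N> naming visible in
/proc/interrupts and the rx_<N>_packets counters from ethtool -S shown further
down; the helper names, the device name and the "charge a queue to the CPU that
fielded most of its interrupts" heuristic are mine, purely for illustration.

import re
import subprocess

def queue_to_cpu():
    # Map mana queue index -> CPU that has handled most of its interrupts.
    mapping = {}
    with open("/proc/interrupts") as f:
        ncpus = len(f.readline().split())        # header row: CPU0 CPU1 ...
        for line in f:
            m = re.search(r"mana_q(\d+)", line)
            if not m:
                continue
            counts = [int(c) for c in line.split()[1:1 + ncpus]]
            mapping[int(m.group(1))] = counts.index(max(counts))
    return mapping

def per_queue_rx_packets(dev="eth0"):
    # Parse "rx_<q>_packets: <value>" counters out of `ethtool -S <dev>`.
    out = subprocess.run(["ethtool", "-S", dev],
                         capture_output=True, text=True, check=True).stdout
    return {int(q): int(v)
            for q, v in re.findall(r"rx_(\d+)_packets:\s+(\d+)", out)}

def approx_rx_per_cpu(dev="eth0"):
    # Attribute each queue's packets to its currently dominant CPU.
    q2c = queue_to_cpu()
    percpu = {}
    for q, pkts in per_queue_rx_packets(dev).items():
        cpu = q2c.get(q)
        percpu[cpu] = percpu.get(cpu, 0) + pkts
    return percpu

print(approx_rx_per_cpu("eth0"))

The point is that this only holds while the queue->CPU mapping stays fixed; once
irqbalance moves an IRQ, the historical per-queue counters can no longer be
split per CPU after the fact.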
After the interrupt affinity for mana_q2 is changed, the affinity table looks as follows:
25: 1 103 0 3038084 Hyper-V PCIe MSI 4138200989697-edge mana_q0@pci:7870:00:00.0
26: 0 1 4012447 0 Hyper-V PCIe MSI 4138200989698-edge mana_q1@pci:7870:00:00.0
27: 157181 10 1 3007990 Hyper-V PCIe MSI 4138200989699-edge mana_q2@pci:7870:00:00.0
28: 3593858 0 0 1 Hyper-V PCIe MSI 4138200989700-edge mana_q3@pci:7870:00:00.0
During such a transition we may end up calculating the per-CPU stats incorrectly,
skewing the picture of the MANA driver's CPU usage that monitoring services rely on.
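To make that concrete with the numbers above, here is a quick back-of-the-envelope
check (a throwaway sketch, not driver code) using the mana_q2 row copied from the
two /proc/interrupts snapshots, columns CPU0..CPU3:

# mana_q2 (IRQ 27) per-CPU interrupt counts, taken from the two tables above.
before = [0, 0, 1, 2997584]
after  = [157181, 10, 1, 3007990]

delta = [a - b for a, b in zip(after, before)]
print(delta)
# A frozen queue->CPU map would charge all of mana_q2's RX work to CPU3,
# but over this interval CPU0 actually fielded the bulk of its interrupts.
print(f"CPU3 share of mana_q2 interrupts in this interval: {delta[3] / sum(delta):.0%}")

Per-CPU counters maintained by the driver itself would not depend on
reconstructing this after the fact.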
Also sharing the existing per-queue stats collected during this experiment, in case they are needed.
Per-queue stats before changing CPU-affinities:
tx_cq_err: 0
tx_cqe_unknown_type: 0
rx_coalesced_err: 0
rx_cqe_unknown_type: 0
rx_0_packets: 4230152
rx_0_bytes: 289545167
rx_0_xdp_drop: 0
rx_0_xdp_tx: 0
rx_0_xdp_redirect: 0
rx_1_packets: 4113017
rx_1_bytes: 314552601
rx_1_xdp_drop: 0
rx_1_xdp_tx: 0
rx_1_xdp_redirect: 0
rx_2_packets: 4458906
rx_2_bytes: 305117506
rx_2_xdp_drop: 0
rx_2_xdp_tx: 0
rx_2_xdp_redirect: 0
rx_3_packets: 4619589
rx_3_bytes: 315445084
rx_3_xdp_drop: 0
rx_3_xdp_tx: 0
rx_3_xdp_redirect: 0
hc_tx_err_vport_disabled: 0
hc_tx_err_inval_vportoffset_pkt: 0
hc_tx_err_vlan_enforcement: 0
hc_tx_err_eth_type_enforcement: 0
hc_tx_err_sa_enforcement: 0
hc_tx_err_sqpdid_enforcement: 0
hc_tx_err_cqpdid_enforcement: 0
hc_tx_err_mtu_violation: 0
hc_tx_err_inval_oob: 0
hc_tx_err_gdma: 0
hc_tx_bytes: 126336708121
hc_tx_ucast_pkts: 86748013
hc_tx_ucast_bytes: 126336703775
hc_tx_bcast_pkts: 37
hc_tx_bcast_bytes: 2842
hc_tx_mcast_pkts: 7
hc_tx_mcast_bytes: 1504
tx_0_packets: 5995507
tx_0_bytes: 28749696408
tx_0_xdp_xmit: 0
tx_0_tso_packets: 4719840
tx_0_tso_bytes: 26873844525
tx_0_tso_inner_packets: 0
tx_0_tso_inner_bytes: 0
tx_0_long_pkt_fmt: 0
tx_0_short_pkt_fmt: 5995507
tx_0_csum_partial: 1275621
tx_0_mana_map_err: 0
tx_1_packets: 6653598
tx_1_bytes: 38318341475
tx_1_xdp_xmit: 0
tx_1_tso_packets: 5330921
tx_1_tso_bytes: 36210150488
tx_1_tso_inner_packets: 0
tx_1_tso_inner_bytes: 0
tx_1_long_pkt_fmt: 0
tx_1_short_pkt_fmt: 6653598
tx_1_csum_partial: 1322643
tx_1_mana_map_err: 0
tx_2_packets: 5715246
tx_2_bytes: 25662283686
tx_2_xdp_xmit: 0
tx_2_tso_packets: 4619118
tx_2_tso_bytes: 23829680267
tx_2_tso_inner_packets: 0
tx_2_tso_inner_bytes: 0
tx_2_long_pkt_fmt: 0
tx_2_short_pkt_fmt: 5715246
tx_2_csum_partial: 1096092
tx_2_mana_map_err: 0
tx_3_packets: 6175860
tx_3_bytes: 29500667904
tx_3_xdp_xmit: 0
tx_3_tso_packets: 4951591
tx_3_tso_bytes: 27446937448
tx_3_tso_inner_packets: 0
tx_3_tso_inner_bytes: 0
tx_3_long_pkt_fmt: 0
tx_3_short_pkt_fmt: 6175860
tx_3_csum_partial: 1224213
tx_3_mana_map_err: 0
Per-queue stats after changing CPU-affinities:
rx_0_packets: 4781895
rx_0_bytes: 326478061
rx_0_xdp_drop: 0
rx_0_xdp_tx: 0
rx_0_xdp_redirect: 0
rx_1_packets: 4116990
rx_1_bytes: 315439234
rx_1_xdp_drop: 0
rx_1_xdp_tx: 0
rx_1_xdp_redirect: 0
rx_2_packets: 4528800
rx_2_bytes: 310312337
rx_2_xdp_drop: 0
rx_2_xdp_tx: 0
rx_2_xdp_redirect: 0
rx_3_packets: 4622622
rx_3_bytes: 316282431
rx_3_xdp_drop: 0
rx_3_xdp_tx: 0
rx_3_xdp_redirect: 0
tx_0_packets: 5999379
tx_0_bytes: 28750864476
tx_0_xdp_xmit: 0
tx_0_tso_packets: 4720027
tx_0_tso_bytes: 26874344494
tx_0_tso_inner_packets: 0
tx_0_tso_inner_bytes: 0
tx_0_long_pkt_fmt: 0
tx_0_short_pkt_fmt: 5999379
tx_0_csum_partial: 1279296
tx_0_mana_map_err: 0
tx_1_packets: 6656913
tx_1_bytes: 38319355168
tx_1_xdp_xmit: 0
tx_1_tso_packets: 5331086
tx_1_tso_bytes: 36210592040
tx_1_tso_inner_packets: 0
tx_1_tso_inner_bytes: 0
tx_1_long_pkt_fmt: 0
tx_1_short_pkt_fmt: 6656913
tx_1_csum_partial: 1325785
tx_1_mana_map_err: 0
tx_2_packets: 5906172
tx_2_bytes: 36758032245
tx_2_xdp_xmit: 0
tx_2_tso_packets: 4806348
tx_2_tso_bytes: 34912213258
tx_2_tso_inner_packets: 0
tx_2_tso_inner_bytes: 0
tx_2_long_pkt_fmt: 0
tx_2_short_pkt_fmt: 5906172
tx_2_csum_partial: 1099782
tx_2_mana_map_err: 0
tx_3_packets: 6202399
tx_3_bytes: 30840325531
tx_3_xdp_xmit: 0
tx_3_tso_packets: 4973730
tx_3_tso_bytes: 28784371532
tx_3_tso_inner_packets: 0
tx_3_tso_inner_bytes: 0
tx_3_long_pkt_fmt: 0
tx_3_short_pkt_fmt: 6202399
tx_3_csum_partial: 1228603
tx_3_mana_map_err: 0