Message-ID: <9c25971b-78e9-956f-95a5-38e688240ef6@nvidia.com>
Date: Mon, 10 Jul 2023 18:57:24 +0300
From: Mark Bloch <mbloch@...dia.com>
To: Aleksander Trofimowicz <alex@....eu>, netdev@...r.kernel.org
Subject: Re: [bug] failed to enable eswitch SRIOV in mlx5_device_enable_sriov()

On 10/07/2023 18:25, Aleksander Trofimowicz wrote:
> Hi,
>
> I've noticed a regression in the mlx5_core driver: defining VFs via
> /sys/bus/pci/devices/.../sriov_numvfs is no longer possible.
>
> Upon a write call the following error is returned:
>
>
> Jul 10 11:07:44 server kernel: mlx5_core 0000:c1:00.0: mlx5_cmd_out_err:803:(pid 1097): QUERY_HCA_CAP(0x100) op_mod(0x40) failed, status bad parameter(0x3), syndrome (0x5add95), err(-22)
> Jul 10 11:07:44 server kernel: mlx5_core 0000:c1:00.0: mlx5_device_enable_sriov:82:(pid 1097): failed to enable eswitch SRIOV (-22)
> Jul 10 11:07:44 server kernel: mlx5_core 0000:c1:00.0: mlx5_sriov_enable:168:(pid 1097): mlx5_device_enable_sriov failed : -22
>

Hi Aleksander,

This should fix the issue:
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c?id=6496357aa5f710eec96f91345b9da1b37c3231f6

Mark

>
> which could be traced back to a second call to mlx5_vport_get_other_func_cap() in
> mlx5_core_sriov_configure()->[...]->mlx5_esw_vport_enable()->[...]->mlx5_esw_vport_caps_get():
>
>
> 7678.225594 | 21) bash-1604 | | mlx5_esw_vport_enable [mlx5_core]() {
> 7678.225594 | 21) bash-1604 | | mlx5_eswitch_get_vport [mlx5_core]() {
> 7678.225594 | 21) bash-1604 | 0.300 us | __rcu_read_lock();
> 7678.225595 | 21) bash-1604 | 0.270 us | __rcu_read_unlock();
> 7678.225595 | 21) bash-1604 | 1.320 us | } /* mlx5_eswitch_get_vport [mlx5_core] */
> 7678.225595 | 21) bash-1604 | 0.260 us | mutex_lock();
> 7678.225596 | 21) bash-1604 | | esw_legacy_vport_acl_setup [mlx5_core]() {
> 7678.225596 | 21) bash-1604 | | esw_acl_ingress_lgcy_setup [mlx5_core]() {
> 7678.225597 | 21) bash-1604 | 0.290 us | esw_acl_ingress_allow_rule_destroy [mlx5_core]();
> 7678.225597 | 21) bash-1604 | 0.290 us | esw_acl_ingress_lgcy_cleanup [mlx5_core]();
> 7678.225598 | 21) bash-1604 | 1.720 us | } /* esw_acl_ingress_lgcy_setup [mlx5_core] */
> 7678.225598 | 21) bash-1604 | | esw_acl_egress_lgcy_setup [mlx5_core]() {
> 7678.225599 | 21) bash-1604 | | mlx5_fc_create [mlx5_core]() {
> 7678.225599 | 21) bash-1604 | | mlx5_fc_create_ex [mlx5_core]() {
> 7678.225600 | 21) bash-1604 | 0.760 us | kmalloc_trace();
> 7678.225600 | 21) bash-1604 | ! 500.430 us | mlx5_cmd_fc_alloc [mlx5_core]();
> 7678.226101 | 21) bash-1604 | ! 502.070 us | } /* mlx5_fc_create_ex [mlx5_core] */
> 7678.226101 | 21) bash-1604 | ! 502.650 us | } /* mlx5_fc_create [mlx5_core] */
> 7678.226102 | 21) bash-1604 | 0.410 us | esw_acl_egress_vlan_destroy [mlx5_core]();
> 7678.226102 | 21) bash-1604 | | esw_acl_egress_lgcy_cleanup [mlx5_core]() {
> 7678.226103 | 21) bash-1604 | | mlx5_fc_destroy [mlx5_core]() {
> 7678.226103 | 21) bash-1604 | ! 239.330 us | mlx5_fc_release [mlx5_core]();
> 7678.226343 | 21) bash-1604 | ! 240.080 us | } /* mlx5_fc_destroy [mlx5_core] */
> 7678.226343 | 21) bash-1604 | ! 240.680 us | } /* esw_acl_egress_lgcy_cleanup [mlx5_core] */
> 7678.226343 | 21) bash-1604 | ! 745.110 us | } /* esw_acl_egress_lgcy_setup [mlx5_core] */
> 7678.226344 | 21) bash-1604 | ! 747.640 us | } /* esw_legacy_vport_acl_setup [mlx5_core] */
> 7678.226344 | 21) bash-1604 | | kmalloc_trace() {
> 7678.226344 | 21) bash-1604 | | __kmem_cache_alloc_node() {
> 7678.226344 | 21) bash-1604 | 0.250 us | should_failslab();
> 7678.226345 | 21) bash-1604 | 1.080 us | } /* __kmem_cache_alloc_node */
> 7678.226345 | 21) bash-1604 | 1.570 us | } /* kmalloc_trace */
> 7678.226346 | 21) bash-1604 | | mlx5_vport_get_other_func_cap [mlx5_core]() {
> 7678.226346 | 21) bash-1604 | | mlx5_cmd_exec [mlx5_core]() {
> 7678.226346 | 21) bash-1604 | | mlx5_cmd_do [mlx5_core]() {
> 7678.226346 | 21) bash-1604 | | cmd_exec [mlx5_core]() {
> 7678.226347 | 21) bash-1604 | 0.750 us | mlx5_alloc_cmd_msg [mlx5_core]();
> 7678.226348 | 21) bash-1604 | 0.250 us | _raw_spin_lock();
> 7678.226348 | 21) bash-1604 | 0.260 us | _raw_spin_unlock();
> 7678.226349 | 21) bash-1604 | 7.290 us | mlx5_alloc_cmd_msg [mlx5_core]();
> 7678.226356 | 21) bash-1604 | 0.560 us | kmalloc_trace();
> 7678.226357 | 21) bash-1604 | 0.260 us | __init_swait_queue_head();
> 7678.226358 | 21) bash-1604 | 0.250 us | __init_swait_queue_head();
> 7678.226358 | 21) bash-1604 | 0.250 us | init_timer_key();
> 7678.226359 | 21) bash-1604 | 4.160 us | queue_work_on();
> 7678.226363 | 21) bash-1604 | 0.260 us | _mlx5_tout_ms [mlx5_core]();
> 7678.226363 | 21) bash-1604 | 0.250 us | __msecs_to_jiffies();
> 7678.226364 | 21) bash-1604 | + 36.880 us | wait_for_completion_timeout();
> 7678.226401 | 21) bash-1604 | # 1772.890 us | wait_for_completion_timeout();
> 7678.228175 | 21) bash-1604 | 0.440 us | _raw_spin_lock_irq();
> 7678.228175 | 21) bash-1604 | 0.330 us | _raw_spin_unlock_irq();
> 7678.228176 | 21) bash-1604 | 1.290 us | cmd_ent_put [mlx5_core]();
> 7678.228177 | 21) bash-1604 | 3.020 us | mlx5_copy_from_msg [mlx5_core]();
> 7678.228181 | 21) bash-1604 | 0.530 us | dma_pool_free();
> 7678.228182 | 21) bash-1604 | 0.380 us | kfree();
> 7678.228182 | 21) bash-1604 | 0.720 us | dma_pool_free();
> 7678.228183 | 21) bash-1604 | 0.360 us | kfree();
> 7678.228184 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228185 | 21) bash-1604 | 0.370 us | kfree();
> 7678.228186 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228186 | 21) bash-1604 | 0.360 us | kfree();
> 7678.228187 | 21) bash-1604 | 0.480 us | dma_pool_free();
> 7678.228188 | 21) bash-1604 | 0.370 us | kfree();
> 7678.228188 | 21) bash-1604 | 1.030 us | dma_pool_free();
> 7678.228190 | 21) bash-1604 | 0.370 us | kfree();
> 7678.228190 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228191 | 21) bash-1604 | 0.360 us | kfree();
> 7678.228192 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228193 | 21) bash-1604 | 0.370 us | kfree();
> 7678.228193 | 21) bash-1604 | 0.360 us | kfree();
> 7678.228194 | 21) bash-1604 | 0.710 us | free_msg [mlx5_core]();
> 7678.228195 | 21) bash-1604 | # 1848.590 us | } /* cmd_exec [mlx5_core] */
> 7678.228195 | 21) bash-1604 | | cmd_status_err [mlx5_core]() {
> 7678.228196 | 21) bash-1604 | 0.950 us | mlx5_command_str [mlx5_core]();
> 7678.228197 | 21) bash-1604 | 1.780 us | } /* cmd_status_err [mlx5_core] */
> 7678.228197 | 21) bash-1604 | # 1851.180 us | } /* mlx5_cmd_do [mlx5_core] */
> 7678.228198 | 21) bash-1604 | 0.260 us | mlx5_cmd_check [mlx5_core]();
> 7678.228198 | 21) bash-1604 | # 1852.150 us | } /* mlx5_cmd_exec [mlx5_core] */
> 7678.228198 | 21) bash-1604 | # 1852.620 us | } /* mlx5_vport_get_other_func_cap [mlx5_core] */
> 7678.228199 | 21) bash-1604 | | mlx5_vport_get_other_func_cap [mlx5_core]() {
> 7678.228199 | 21) bash-1604 | | mlx5_cmd_exec [mlx5_core]() {
> 7678.228199 | 21) bash-1604 | | mlx5_cmd_do [mlx5_core]() {
> 7678.228199 | 21) bash-1604 | | cmd_exec [mlx5_core]() {
> 7678.228200 | 21) bash-1604 | 0.710 us | mlx5_alloc_cmd_msg [mlx5_core]();
> 7678.228201 | 21) bash-1604 | 0.270 us | _raw_spin_lock();
> 7678.228201 | 21) bash-1604 | 0.260 us | _raw_spin_unlock();
> 7678.228202 | 21) bash-1604 | 7.160 us | mlx5_alloc_cmd_msg [mlx5_core]();
> 7678.228209 | 21) bash-1604 | 0.570 us | kmalloc_trace();
> 7678.228210 | 21) bash-1604 | 0.260 us | __init_swait_queue_head();
> 7678.228210 | 21) bash-1604 | 0.250 us | __init_swait_queue_head();
> 7678.228211 | 21) bash-1604 | 0.290 us | init_timer_key();
> 7678.228212 | 21) bash-1604 | 4.280 us | queue_work_on();
> 7678.228216 | 21) bash-1604 | 0.340 us | _mlx5_tout_ms [mlx5_core]();
> 7678.228217 | 21) bash-1604 | 0.260 us | __msecs_to_jiffies();
> 7678.228217 | 21) bash-1604 | + 37.490 us | wait_for_completion_timeout();
> 7678.228255 | 21) bash-1604 | ! 316.610 us | wait_for_completion_timeout();
> 7678.228572 | 21) bash-1604 | 0.280 us | _raw_spin_lock_irq();
> 7678.228573 | 21) bash-1604 | 0.250 us | _raw_spin_unlock_irq();
> 7678.228573 | 21) bash-1604 | 1.280 us | cmd_ent_put [mlx5_core]();
> 7678.228575 | 21) bash-1604 | 2.840 us | mlx5_copy_from_msg [mlx5_core]();
> 7678.228578 | 21) bash-1604 | 0.510 us | dma_pool_free();
> 7678.228579 | 21) bash-1604 | 0.380 us | kfree();
> 7678.228580 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228580 | 21) bash-1604 | 0.370 us | kfree();
> 7678.228581 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228581 | 21) bash-1604 | 0.380 us | kfree();
> 7678.228582 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228583 | 21) bash-1604 | 0.370 us | kfree();
> 7678.228584 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228584 | 21) bash-1604 | 0.410 us | kfree();
> 7678.228585 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228586 | 21) bash-1604 | 0.380 us | kfree();
> 7678.228587 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228587 | 21) bash-1604 | 0.370 us | kfree();
> 7678.228588 | 21) bash-1604 | 0.490 us | dma_pool_free();
> 7678.228589 | 21) bash-1604 | 0.370 us | kfree();
> 7678.228589 | 21) bash-1604 | 0.380 us | kfree();
> 7678.228590 | 21) bash-1604 | 0.640 us | free_msg [mlx5_core]();
> 7678.228591 | 21) bash-1604 | ! 391.590 us | } /* cmd_exec [mlx5_core] */
> 7678.228591 | 21) bash-1604 | | cmd_status_err [mlx5_core]() {
> 7678.228592 | 21) bash-1604 | 0.270 us | mlx5_command_str [mlx5_core]();
> 7678.228592 | 21) bash-1604 | 0.450 us | _raw_spin_lock_irq();
> 7678.228593 | 21) bash-1604 | 0.310 us | _raw_spin_unlock_irq();
> 7678.228593 | 21) bash-1604 | 2.030 us | } /* cmd_status_err [mlx5_core] */
> 7678.228593 | 21) bash-1604 | ! 394.300 us | } /* mlx5_cmd_do [mlx5_core] */
> 7678.228594 | 21) bash-1604 | | mlx5_cmd_check [mlx5_core]() {
> 7678.228594 | 21) bash-1604 | | mlx5_cmd_out_err [mlx5_core]() {
> 7678.228595 | 21) bash-1604 | 0.260 us | _raw_spin_trylock();
> 7678.228595 | 21) bash-1604 | 0.280 us | _raw_spin_unlock_irqrestore();
> 7678.228596 | 21) bash-1604 | 0.260 us | mlx5_command_str [mlx5_core]();
> 7678.228596 | 21) bash-1604 | ! 158.330 us | _dev_err();
> 7678.228755 | 21) bash-1604 | ! 160.760 us | } /* mlx5_cmd_out_err [mlx5_core] */
> 7678.228755 | 21) bash-1604 | ! 161.340 us | } /* mlx5_cmd_check [mlx5_core] */
> 7678.228755 | 21) bash-1604 | ! 556.330 us | } /* mlx5_cmd_exec [mlx5_core] */
> 7678.228755 | 21) bash-1604 | ! 556.810 us | } /* mlx5_vport_get_other_func_cap [mlx5_core] */
> 7678.228756 | 21) bash-1604 | | kfree() {
> 7678.228756 | 21) bash-1604 | 0.350 us | __kmem_cache_free();
> 7678.228757 | 21) bash-1604 | 0.860 us | } /* kfree */
> 7678.228757 | 21) bash-1604 | | esw_legacy_vport_acl_cleanup [mlx5_core]() {
> 7678.228757 | 21) bash-1604 | 0.310 us | esw_acl_egress_lgcy_cleanup [mlx5_core]();
> 7678.228758 | 21) bash-1604 | 0.350 us | esw_acl_ingress_lgcy_cleanup [mlx5_core]();
> 7678.228758 | 21) bash-1604 | 1.370 us | } /* esw_legacy_vport_acl_cleanup [mlx5_core] */
> 7678.228758 | 21) bash-1604 | 0.250 us | mutex_unlock();
> 7678.228759 | 21) bash-1604 | # 3165.340 us | } /* mlx5_esw_vport_enable [mlx5_core] */
>
>
> This second call was introduced in [0].
>
> Device in use: MCX416A-CCAT.
>
> [0] https://lore.kernel.org/netdev/20221206185119.380138-9-shayd@nvidia.com/
> --
> Kind regards,
> Aleksander Trofimowicz
>
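
The repeated call to mlx5_vport_get_other_func_cap() in the quoted trace can also be confirmed mechanically. What follows is only a minimal sketch, assuming the function_graph output has been saved as plain text lines in the format quoted above. The helper name count_calls and the regular expression are illustrative assumptions, not part of the kernel or of any tracing tool.

```python
import re
from collections import Counter

# Matches a function entry ("name [module]() {") or a leaf call ("name();")
# in function_graph output. Closing lines ("} /* name [module] */") contain
# no "()" and therefore do not match, so each call is counted once.
CALL_RE = re.compile(r"(\w+)(?: \[\w+\])?\(\)\s*[{;]")


def count_calls(trace_lines):
    """Return a Counter mapping function name to number of calls seen."""
    counts = Counter()
    for line in trace_lines:
        match = CALL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the quoted trace, used here as sample input.
sample = [
    "> 7678.226346 | 21) bash-1604 | | mlx5_vport_get_other_func_cap [mlx5_core]() {",
    "> 7678.226346 | 21) bash-1604 | | mlx5_cmd_exec [mlx5_core]() {",
    "> 7678.228198 | 21) bash-1604 | # 1852.620 us | } /* mlx5_vport_get_other_func_cap [mlx5_core] */",
    "> 7678.228199 | 21) bash-1604 | | mlx5_vport_get_other_func_cap [mlx5_core]() {",
    "> 7678.228755 | 21) bash-1604 | ! 556.810 us | } /* mlx5_vport_get_other_func_cap [mlx5_core] */",
    "> 7678.228756 | 21) bash-1604 | 0.350 us | __kmem_cache_free();",
]

counts = count_calls(sample)
print(counts["mlx5_vport_get_other_func_cap"])
```

Feeding the full trace to count_calls() instead of the sample gives a per-function call count for the whole mlx5_esw_vport_enable() invocation, which makes a duplicated call easy to spot.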