Message-ID: <6a285586-0a11-43be-a07a-5ba0b92d0ee6@gmail.com>
Date: Mon, 6 May 2024 12:58:13 +0200
From: Zhu Yanjun <zyjzyj2000@...il.com>
To: shaozhengchao <shaozhengchao@...wei.com>, saeedm@...dia.com,
tariqt@...dia.com, borisp@...dia.com, shayd@...dia.com, msanalla@...dia.com,
Rahul Rameshbabu <rrameshbabu@...dia.com>, weizhang@...dia.com,
kliteyn@...dia.com, erezsh@...dia.com, igozlan@...dia.com
Cc: netdev <netdev@...r.kernel.org>, linux-rdma@...r.kernel.org
Subject: Re: [question] when bonding with CX5 network card that supports RoCE
On 06.05.24 12:45, shaozhengchao wrote:
> Hi Yanjun,
> The following is the output of the cat /proc/net/bonding/bond0 command:
If I remember correctly, this looks like an RDMA LAG and bonding problem.
I am not sure whether it is a known problem. Please contact your local
support.
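
In the meantime, you could confirm on your side that RoCE LAG is really
active, which would explain why only the single mlx5_bond_0 device is
exposed. A rough check, assuming the iproute2 rdma tool is installed and
noting that the exact log wording can differ between driver versions:

# mlx5_core normally logs a "lag map" message when it activates LAG
dmesg | grep -i "lag map"

# with RoCE LAG active, the remaining RDMA device is expected to be
# attached to the bond netdev rather than to the individual PFs
rdma link show
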
Zhu Yanjun
> [root@...alhost ~]# cat /proc/net/bonding/bond0
> Ethernet Channel Bonding Driver: v5.10.0+
>
> Bonding Mode: IEEE 802.3ad Dynamic link aggregation
> Transmit Hash Policy: layer2 (0)
> MII Status: up
> MII Polling Interval (ms): 100
> Up Delay (ms): 0
> Down Delay (ms): 0
> Peer Notification Delay (ms): 0
>
> 802.3ad info
> LACP rate: slow
> Min links: 0
> Aggregator selection policy (ad_select): stable
> System priority: 65535
> System MAC address: f4:1d:6b:6f:3b:97
> Active Aggregator Info:
> Aggregator ID: 2
> Number of ports: 1
> Actor Key: 23
> Partner Key: 1
> Partner Mac Address: 00:00:00:00:00:00
>
> Slave Interface: enp145s0f0
> MII Status: up
> Speed: 40000 Mbps
> Duplex: full
> Link Failure Count: 1
> Permanent HW addr: f4:1d:6b:6f:3b:97
> Slave queue ID: 0
> Aggregator ID: 1
> Actor Churn State: churned
> Partner Churn State: churned
> Actor Churned Count: 1
> Partner Churned Count: 2
> details actor lacp pdu:
> system priority: 65535
> system mac address: f4:1d:6b:6f:3b:97
> port key: 23
> port priority: 255
> port number: 1
> port state: 69
> details partner lacp pdu:
> system priority: 65535
> system mac address: 00:00:00:00:00:00
> oper key: 1
> port priority: 255
> port number: 1
> port state: 1
>
> Slave Interface: enp145s0f1
> MII Status: up
> Speed: 40000 Mbps
> Duplex: full
> Link Failure Count: 0
> Permanent HW addr: f4:1d:6b:6f:3b:98
> Slave queue ID: 0
> Aggregator ID: 2
> Actor Churn State: none
> Partner Churn State: churned
> Actor Churned Count: 0
> Partner Churned Count: 1
> details actor lacp pdu:
> system priority: 65535
> system mac address: f4:1d:6b:6f:3b:97
> port key: 23
> port priority: 255
> port number: 2
> port state: 77
> details partner lacp pdu:
> system priority: 65535
> system mac address: 00:00:00:00:00:00
> oper key: 1
> port priority: 255
> port number: 1
> port state: 1
>
> Thank you
> Zhengchao Shao
>
>
> On 2024/5/6 16:26, Zhu Yanjun wrote:
>> On 06.05.24 06:46, shaozhengchao wrote:
>>>
>>> When using the 5.10 kernel, I can find two IB devices using the
>>> ibv_devinfo command.
>>> ----------------------------------
>>> [root@...alhost ~]# lspci
>>> 91:00.0 Ethernet controller: Mellanox Technologies MT27800 Family
>>> [ConnectX-5]
>>> 91:00.1 Ethernet controller: Mellanox Technologies MT27800 Family
>>> ----------------------------------
>>> [root@...alhost ~]# ibv_devinfo
>>> hca_id: mlx5_0
>>> transport: InfiniBand (0)
>>> fw_ver: 16.31.1014
>>> node_guid: f41d:6b03:006f:4743
>>> sys_image_guid: f41d:6b03:006f:4743
>>> vendor_id: 0x02c9
>>> vendor_part_id: 4119
>>> hw_ver: 0x0
>>> board_id: HUA0000000004
>>> phys_port_cnt: 1
>>> port: 1
>>> state: PORT_ACTIVE (4)
>>> max_mtu: 4096 (5)
>>> active_mtu: 1024 (3)
>>> sm_lid: 0
>>> port_lid: 0
>>> port_lmc: 0x00
>>> link_layer: Ethernet
>>>
>>> hca_id: mlx5_1
>>> transport: InfiniBand (0)
>>> fw_ver: 16.31.1014
>>> node_guid: f41d:6b03:006f:4744
>>> sys_image_guid: f41d:6b03:006f:4743
>>> vendor_id: 0x02c9
>>> vendor_part_id: 4119
>>> hw_ver: 0x0
>>> board_id: HUA0000000004
>>> phys_port_cnt: 1
>>> port: 1
>>> state: PORT_ACTIVE (4)
>>> max_mtu: 4096 (5)
>>> active_mtu: 1024 (3)
>>> sm_lid: 0
>>> port_lid: 0
>>> port_lmc: 0x00
>>> link_layer: Ethernet
>>> ----------------------------------
>>> But after the two network ports are bonded, only one IB device is
>>> available, and only PF0 can be used.
>>> [root@...alhost shaozhengchao]# ibv_devinfo
>>> hca_id: mlx5_bond_0
>>> transport: InfiniBand (0)
>>> fw_ver: 16.31.1014
>>> node_guid: f41d:6b03:006f:4743
>>> sys_image_guid: f41d:6b03:006f:4743
>>> vendor_id: 0x02c9
>>> vendor_part_id: 4119
>>> hw_ver: 0x0
>>> board_id: HUA0000000004
>>> phys_port_cnt: 1
>>> port: 1
>>> state: PORT_ACTIVE (4)
>>> max_mtu: 4096 (5)
>>> active_mtu: 1024 (3)
>>> sm_lid: 0
>>> port_lid: 0
>>> port_lmc: 0x00
>>> link_layer: Ethernet
>>>
>>> The current Linux mainline driver behaves the same way.
>>>
>>> I found the comment ("If bonded, we do not add an IB device for PF1.")
>>> in the mlx5_lag_intf_add function of the 5.10 branch driver code.
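>>> (For reference, that check can be located in the kernel tree with
>>> something like the following; the path may differ slightly between
>>> kernel versions:
>>>
>>> grep -rn "we do not add an IB device" drivers/net/ethernet/mellanox/mlx5/core/
>>> )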
>>
>> Not sure whether RDMA LAG is enabled here or not. /proc/net/bonding
>> normally provides more details.
>>
>> Zhu Yanjun
>>
>>> Does this indicate that when the same NIC is used, only PF0 supports
>>> bonding?
>>> Are there any other constraints when enabling bonding with CX5?
>>>
>>> Thank you
>>> Zhengchao Shao
>>