Message-ID: <b10f0074-4003-2862-78ff-9b5dfcdc8566@huawei.com>
Date: Tue, 7 May 2024 09:23:37 +0800
From: shaozhengchao <shaozhengchao@...wei.com>
To: Zhu Yanjun <zyjzyj2000@...il.com>, <saeedm@...dia.com>,
<tariqt@...dia.com>, <borisp@...dia.com>, <shayd@...dia.com>,
<msanalla@...dia.com>, Rahul Rameshbabu <rrameshbabu@...dia.com>,
<weizhang@...dia.com>, <kliteyn@...dia.com>, <erezsh@...dia.com>,
<igozlan@...dia.com>
CC: netdev <netdev@...r.kernel.org>, <linux-rdma@...r.kernel.org>
Subject: Re: [question] when bonding with CX5 network card that support ROCE
On 2024/5/6 20:27, Zhu Yanjun wrote:
> On 06.05.24 13:33, shaozhengchao wrote:
>>
>> Hi Yanjun:
>> Thank you for your reply. Are there any other restrictions on using
>> RoCE on the CX5?
>
> https://docs.nvidia.com/networking/display/mlnxofedv571020
>
> The above link can answer all your questions ^_^
>
> Enjoy it.
>
> Zhu Yanjun
>
Thank you.
Zhengchao Shao
>>
>> Zhengchao Shao
>>
>> On 2024/5/6 18:58, Zhu Yanjun wrote:
>>>
>>> On 06.05.24 12:45, shaozhengchao wrote:
>>>> Hi Yanjun:
>>>> The following is the output of the cat /proc/net/bonding/bond0
>>>> command:
>>>
>>> If I remember correctly, it seems to be an RDMA LAG and bonding
>>> problem.
>>>
>>> Not sure if it is a known problem or not. Please contact your local
>>> support.
>>>
>>> Zhu Yanjun
>>>
>>>> [root@...alhost ~]# cat /proc/net/bonding/bond0
>>>> Ethernet Channel Bonding Driver: v5.10.0+
>>>>
>>>> Bonding Mode: IEEE 802.3ad Dynamic link aggregation
>>>> Transmit Hash Policy: layer2 (0)
>>>> MII Status: up
>>>> MII Polling Interval (ms): 100
>>>> Up Delay (ms): 0
>>>> Down Delay (ms): 0
>>>> Peer Notification Delay (ms): 0
>>>>
>>>> 802.3ad info
>>>> LACP rate: slow
>>>> Min links: 0
>>>> Aggregator selection policy (ad_select): stable
>>>> System priority: 65535
>>>> System MAC address: f4:1d:6b:6f:3b:97
>>>> Active Aggregator Info:
>>>> Aggregator ID: 2
>>>> Number of ports: 1
>>>> Actor Key: 23
>>>> Partner Key: 1
>>>> Partner Mac Address: 00:00:00:00:00:00
>>>>
>>>> Slave Interface: enp145s0f0
>>>> MII Status: up
>>>> Speed: 40000 Mbps
>>>> Duplex: full
>>>> Link Failure Count: 1
>>>> Permanent HW addr: f4:1d:6b:6f:3b:97
>>>> Slave queue ID: 0
>>>> Aggregator ID: 1
>>>> Actor Churn State: churned
>>>> Partner Churn State: churned
>>>> Actor Churned Count: 1
>>>> Partner Churned Count: 2
>>>> details actor lacp pdu:
>>>> system priority: 65535
>>>> system mac address: f4:1d:6b:6f:3b:97
>>>> port key: 23
>>>> port priority: 255
>>>> port number: 1
>>>> port state: 69
>>>> details partner lacp pdu:
>>>> system priority: 65535
>>>> system mac address: 00:00:00:00:00:00
>>>> oper key: 1
>>>> port priority: 255
>>>> port number: 1
>>>> port state: 1
>>>>
>>>> Slave Interface: enp145s0f1
>>>> MII Status: up
>>>> Speed: 40000 Mbps
>>>> Duplex: full
>>>> Link Failure Count: 0
>>>> Permanent HW addr: f4:1d:6b:6f:3b:98
>>>> Slave queue ID: 0
>>>> Aggregator ID: 2
>>>> Actor Churn State: none
>>>> Partner Churn State: churned
>>>> Actor Churned Count: 0
>>>> Partner Churned Count: 1
>>>> details actor lacp pdu:
>>>> system priority: 65535
>>>> system mac address: f4:1d:6b:6f:3b:97
>>>> port key: 23
>>>> port priority: 255
>>>> port number: 2
>>>> port state: 77
>>>> details partner lacp pdu:
>>>> system priority: 65535
>>>> system mac address: 00:00:00:00:00:00
>>>> oper key: 1
>>>> port priority: 255
>>>> port number: 1
>>>> port state: 1
>>>>
>>>> Thank you
>>>> Zhengchao Shao
>>>>
>>>>
>>>> On 2024/5/6 16:26, Zhu Yanjun wrote:
>>>>> On 06.05.24 06:46, shaozhengchao wrote:
>>>>>>
>>>>>> When using the 5.10 kernel, I can find two IB devices using the
>>>>>> ibv_devinfo command.
>>>>>> ----------------------------------
>>>>>> [root@...alhost ~]# lspci
>>>>>> 91:00.0 Ethernet controller: Mellanox Technologies MT27800 Family
>>>>>> [ConnectX-5]
>>>>>> 91:00.1 Ethernet controller: Mellanox Technologies MT27800 Family
>>>>>> ----------------------------------
>>>>>> [root@...alhost ~]# ibv_devinfo
>>>>>> hca_id: mlx5_0
>>>>>> transport: InfiniBand (0)
>>>>>> fw_ver: 16.31.1014
>>>>>> node_guid: f41d:6b03:006f:4743
>>>>>> sys_image_guid: f41d:6b03:006f:4743
>>>>>> vendor_id: 0x02c9
>>>>>> vendor_part_id: 4119
>>>>>> hw_ver: 0x0
>>>>>> board_id: HUA0000000004
>>>>>> phys_port_cnt: 1
>>>>>> port: 1
>>>>>> state: PORT_ACTIVE (4)
>>>>>> max_mtu: 4096 (5)
>>>>>> active_mtu: 1024 (3)
>>>>>> sm_lid: 0
>>>>>> port_lid: 0
>>>>>> port_lmc: 0x00
>>>>>> link_layer: Ethernet
>>>>>>
>>>>>> hca_id: mlx5_1
>>>>>> transport: InfiniBand (0)
>>>>>> fw_ver: 16.31.1014
>>>>>> node_guid: f41d:6b03:006f:4744
>>>>>> sys_image_guid: f41d:6b03:006f:4743
>>>>>> vendor_id: 0x02c9
>>>>>> vendor_part_id: 4119
>>>>>> hw_ver: 0x0
>>>>>> board_id: HUA0000000004
>>>>>> phys_port_cnt: 1
>>>>>> port: 1
>>>>>> state: PORT_ACTIVE (4)
>>>>>> max_mtu: 4096 (5)
>>>>>> active_mtu: 1024 (3)
>>>>>> sm_lid: 0
>>>>>> port_lid: 0
>>>>>> port_lmc: 0x00
>>>>>> link_layer: Ethernet
>>>>>> ----------------------------------
>>>>>> But after the two network ports are bonded, only one IB device is
>>>>>> available, and only PF0 can be used.
>>>>>> [root@...alhost shaozhengchao]# ibv_devinfo
>>>>>> hca_id: mlx5_bond_0
>>>>>> transport: InfiniBand (0)
>>>>>> fw_ver: 16.31.1014
>>>>>> node_guid: f41d:6b03:006f:4743
>>>>>> sys_image_guid: f41d:6b03:006f:4743
>>>>>> vendor_id: 0x02c9
>>>>>> vendor_part_id: 4119
>>>>>> hw_ver: 0x0
>>>>>> board_id: HUA0000000004
>>>>>> phys_port_cnt: 1
>>>>>> port: 1
>>>>>> state: PORT_ACTIVE (4)
>>>>>> max_mtu: 4096 (5)
>>>>>> active_mtu: 1024 (3)
>>>>>> sm_lid: 0
>>>>>> port_lid: 0
>>>>>> port_lmc: 0x00
>>>>>> link_layer: Ethernet
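>>>>>>
>>>>>> (For reference, a minimal libibverbs snippet, not part of the
>>>>>> original report, that simply enumerates the visible RDMA devices;
>>>>>> on the bonded setup above it should print only mlx5_bond_0.)
>>>>>>
>>>>>> /* sketch: list RDMA devices; build with: gcc list_rdma.c -libverbs */
>>>>>> #include <stdio.h>
>>>>>> #include <infiniband/verbs.h>
>>>>>>
>>>>>> int main(void)
>>>>>> {
>>>>>>         int num;
>>>>>>         struct ibv_device **list = ibv_get_device_list(&num);
>>>>>>
>>>>>>         if (!list) {
>>>>>>                 perror("ibv_get_device_list");
>>>>>>                 return 1;
>>>>>>         }
>>>>>>         for (int i = 0; i < num; i++)
>>>>>>                 printf("%s\n", ibv_get_device_name(list[i]));
>>>>>>         ibv_free_device_list(list);
>>>>>>         return 0;
>>>>>> }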
>>>>>>
>>>>>> The current Linux mainline driver behaves the same way.
>>>>>>
>>>>>> I found the comment ("If bonded, we do not add an IB device for
>>>>>> PF1.")
>>>>>> in the mlx5_lag_intf_add function of the 5.10 branch driver code.
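>>>>>>
>>>>>> (Purely for illustration, not the actual driver source: a toy C
>>>>>> model of the gating that comment describes, assuming that when
>>>>>> RoCE LAG is active only the bond master PF registers an IB
>>>>>> device. The struct and helper names below are hypothetical.)
>>>>>>
>>>>>> #include <stdbool.h>
>>>>>> #include <stdio.h>
>>>>>>
>>>>>> struct pf {
>>>>>>         const char *name;
>>>>>>         bool lag_active;  /* RoCE LAG / bond formed      */
>>>>>>         bool lag_master;  /* this PF is the bond master  */
>>>>>> };
>>>>>>
>>>>>> /* "If bonded, we do not add an IB device for PF1." */
>>>>>> static bool adds_ib_device(const struct pf *pf)
>>>>>> {
>>>>>>         return !pf->lag_active || pf->lag_master;
>>>>>> }
>>>>>>
>>>>>> int main(void)
>>>>>> {
>>>>>>         struct pf pf0 = { "PF0", true, true };
>>>>>>         struct pf pf1 = { "PF1", true, false };
>>>>>>
>>>>>>         printf("%s: %s\n", pf0.name,
>>>>>>                adds_ib_device(&pf0) ? "IB device added" : "skipped");
>>>>>>         printf("%s: %s\n", pf1.name,
>>>>>>                adds_ib_device(&pf1) ? "IB device added" : "skipped");
>>>>>>         return 0;
>>>>>> }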
>>>>>
>>>>> Not sure whether RDMA LAG is enabled here. /proc/net/bonding will
>>>>> normally provide more details.
>>>>>
>>>>> Zhu Yanjun
>>>>>
>>>>>> Does this indicate that when the same NIC is used, only PF0
>>>>>> supports bonding?
>>>>>> Are there any other constraints when enabling bonding with the CX5?
>>>>>>
>>>>>> Thank you
>>>>>> Zhengchao Shao
>>>>>
>