Message-ID: <7202f51714ce5a1ce334f5078b2374f3@imap.linux.ibm.com>
Date: Wed, 07 Apr 2021 12:03:15 -0700
From: Dany Madden <drt@...ux.ibm.com>
To: Lijun Pan <ljp@...ux.vnet.ibm.com>
Cc: netdev@...r.kernel.org, David Miller <davem@...emloft.net>,
Rick Lindsley <ricklind@...ux.ibm.com>,
Sukadev Bhattiprolu <sukadev@...ux.ibm.com>
Subject: Re: [PATCH] ibmvnic: Continue with reset if set link down failed
On 2021-04-05 23:46, Lijun Pan wrote:
>> On Apr 5, 2021, at 10:47 PM, Dany Madden <drt@...ux.ibm.com> wrote:
>>
>> When an adapter is going through a reset, it may be in an unstable state
>> that makes a request to set link down fail. In such a case, the adapter
>> needs to continue on with the reset to bring itself back to a stable state.
>>
>> Fixes: ed651a10875f ("ibmvnic: Updated reset handling")
>> Signed-off-by: Dany Madden <drt@...ux.ibm.com>
>> ---
>> drivers/net/ethernet/ibm/ibmvnic.c | 6 ++++--
>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
>> index 9c6438d3b3a5..e4f01a7099a0 100644
>> --- a/drivers/net/ethernet/ibm/ibmvnic.c
>> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
>> @@ -1976,8 +1976,10 @@ static int do_reset(struct ibmvnic_adapter *adapter,
>> rtnl_unlock();
>> rc = set_link_state(adapter, IBMVNIC_LOGICAL_LNK_DN);
>> rtnl_lock();
>> - if (rc)
>> - goto out;
>> + if (rc) {
>> + netdev_dbg(netdev,
>> + "Setting link down failed rc=%d. Continue anyway\n", rc);
>> + }
>
> What’s the point of checking the return code if it can be neglected
> anyway?
> If we really don’t care if set_link_state succeeds or not, we don’t
> even need to call set_link_state() here.
> It seems more correct to me that we find out why set_link_state fails
> and fix it from that end.
We know why set_link_state() failed: the CRQ is no longer active at this
point, so it is not possible to send a link down request to the VIOS. If
the driver exits here, the adapter will be left in an inoperable state. If
it continues on to reinitialize the CRQ, it can finish the reset and come up.
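To make the difference concrete, here is a minimal user-space sketch of the two control flows. It is not driver code: struct fake_adapter and the fake_* functions are hypothetical stand-ins invented for illustration, and the sketch assumes the CRQ re-init always succeeds, which is not always true on real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the adapter; NOT the real struct ibmvnic_adapter. */
struct fake_adapter {
	bool crq_active;	/* can requests reach the VIOS? */
	bool link_up;		/* is the adapter usable after the reset? */
};

/* Stand-in for set_link_state(adapter, IBMVNIC_LOGICAL_LNK_DN):
 * the request cannot be sent when the CRQ is inactive. */
static int fake_set_link_down(struct fake_adapter *a)
{
	if (!a->crq_active)
		return -1;
	a->link_up = false;
	return 0;
}

/* Stand-in for the rest of the reset path, which re-initializes the CRQ
 * and brings the adapter back up. Assumed to always succeed here. */
static int fake_reinit_crq_and_open(struct fake_adapter *a)
{
	a->crq_active = true;
	a->link_up = true;
	return 0;
}

/* continue_on_failure == false models the old "goto out" behavior;
 * continue_on_failure == true models the patched behavior. */
int fake_do_reset(struct fake_adapter *a, bool continue_on_failure)
{
	int rc;

	assert(a != NULL);

	rc = fake_set_link_down(a);
	if (rc) {
		if (!continue_on_failure)
			return rc;	/* old: bail out, adapter stays down */
		fprintf(stderr,
			"Setting link down failed rc=%d. Continue anyway\n", rc);
	}

	return fake_reinit_crq_and_open(a);
}
```

Starting from an adapter with crq_active == false and link_up == false, fake_do_reset(&a, false) returns an error and leaves the adapter down, while fake_do_reset(&a, true) logs the failure, re-initializes the CRQ, and returns 0 with the link up.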
Prior to submitting this patch, we ran a 17-hour and a 24-hour test
(LPM+failover) on 10 vnics. We saw that:
17 hours, hit 4 times
- 3 times the driver was able to continue on to re-init the CRQ and bring
the adapter up.
- 1 time the driver failed to re-init the CRQ because the last reset had
failed and released the CRQ. A subsequent hard reset from a transport event
(failover) succeeded.
24 hours, hit 10 times
- 7 times the driver was able to continue on to re-init the CRQ and bring
the adapter up.
- 3 times the driver failed to re-init the CRQ because the last reset had
failed and released the CRQ. A subsequent hard reset from a transport event
(failover or LPM) succeeded.
In both runs, with the patch, all 10 vnics continued to work as expected.
>
> Lijun
>
>>
>> if (adapter->state == VNIC_OPEN) {
>> /* When we dropped rtnl, ibmvnic_open() got
>> --
>> 2.26.2
>>