[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <637f5bf73c395b49d1b615026d708d20@imap.linux.ibm.com>
Date: Sat, 22 Jan 2022 16:22:10 -0800
From: Dany Madden <drt@...ux.ibm.com>
To: Sukadev Bhattiprolu <sukadev@...ux.ibm.com>
Cc: netdev@...r.kernel.org, Brian King <brking@...ux.ibm.com>,
Rick Lindsley <ricklind@...ux.ibm.com>
Subject: Re: [PATCH net 1/4] ibmvnic: Allow extra failures before disabling
On 2022-01-21 18:59, Sukadev Bhattiprolu wrote:
> If auto-priority-failover (APF) is enabled and there are at least two
> backing devices of different priorities, some resets like fail-over,
> change-param etc can cause at least two back to back failovers.
> (Failover
> from high priority backing device to lower priority one and then back
> to the higher priority one if that is still functional).
>
> Depending on the timimg of the two failovers it is possible to trigger
> a "hard" reset and for the hard reset to fail due to failovers. When
> this
> occurs, the driver assumes that the network is unstable and disables
> the
> VNIC for a 60-second "settling time". This in turn can cause the
> ethtool
> command to fail with "No such device" while the vnic automatically
> recovers
> a little while later.
>
> Given that it's possible to have two back to back failures, allow for
> extra
> failures before disabling the vnic for the settling time.
>
> Fixes: f15fde9d47b8 ("ibmvnic: delay next reset if hard reset fails")
> Signed-off-by: Sukadev Bhattiprolu <sukadev@...ux.ibm.com>
Reviewed-by: Dany Madden <drt@...ux.ibm.com>
> ---
> drivers/net/ethernet/ibm/ibmvnic.c | 21 +++++++++++++++++----
> 1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c
> b/drivers/net/ethernet/ibm/ibmvnic.c
> index 0bb3911dd014..9b2d16ad76f1 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -2598,6 +2598,7 @@ static void __ibmvnic_reset(struct work_struct
> *work)
> struct ibmvnic_rwi *rwi;
> unsigned long flags;
> u32 reset_state;
> + int num_fails = 0;
> int rc = 0;
>
> adapter = container_of(work, struct ibmvnic_adapter, ibmvnic_reset);
> @@ -2651,11 +2652,23 @@ static void __ibmvnic_reset(struct work_struct
> *work)
> rc = do_hard_reset(adapter, rwi, reset_state);
> rtnl_unlock();
> }
> - if (rc) {
> - /* give backing device time to settle down */
> + if (rc)
> + num_fails++;
> + else
> + num_fails = 0;
> +
> + /* If auto-priority-failover is enabled we can get
> + * back to back failovers during resets, resulting
> + * in at least two failed resets (from high-priority
> + * backing device to low-priority one and then back)
> + * If resets continue to fail beyond that, give the
> + * adapter some time to settle down before retrying.
> + */
> + if (num_fails >= 3) {
> netdev_dbg(adapter->netdev,
> - "[S:%s] Hard reset failed, waiting 60 secs\n",
> - adapter_state_to_string(adapter->state));
> + "[S:%s] Hard reset failed %d times, waiting 60 secs\n",
> + adapter_state_to_string(adapter->state),
> + num_fails);
> set_current_state(TASK_UNINTERRUPTIBLE);
> schedule_timeout(60 * HZ);
> }
Powered by blists - more mailing lists