netdev - Re: [PATCH net] ice: Fix VF Reset paths when interface in a failed over aggregate

Message-ID: <ZVYllBDzdLIB97e2@boxer>
Date: Thu, 16 Nov 2023 15:22:12 +0100
From: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
To: Tony Nguyen <anthony.l.nguyen@...el.com>
CC: <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
	<edumazet@...gle.com>, <netdev@...r.kernel.org>, Dave Ertman
	<david.m.ertman@...el.com>, <carolyn.wyborny@...el.com>,
	<daniel.machon@...rochip.com>, Przemek Kitszel
	<przemyslaw.kitszel@...el.com>, Sujai Buvaneswaran
	<sujai.buvaneswaran@...el.com>
Subject: Re: [PATCH net] ice: Fix VF Reset paths when interface in a failed
 over aggregate

On Wed, Nov 15, 2023 at 01:12:41PM -0800, Tony Nguyen wrote:
> From: Dave Ertman <david.m.ertman@...el.com>
> 
> There is an error when an interface has the following conditions:
> - PF is in an aggregate (bond)
> - PF has VFs created on it
> - bond is in a state where it is failed-over to the secondary interface
> - A VF reset is issued on one or more of those VFs
> 
> The issue is generated by the originating PF trying to rebuild or
> reconfigure the VF resources.  Since the bond is failed over to the
> secondary interface the queue contexts are in a modified state.
> 
> To fix this issue, have the originating interface reclaim its resources
> prior to the tear-down and rebuild or reconfigure.  Then after the process
> is complete, move the resources back to the currently active interface.
> 
> There are multiple paths that can be used depending on what triggered the
> event, so create a helper function to move the queues and use paired calls
> to the helper (back to origin, process, then move back to active interface)
> under the same lag_mutex lock.
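
For readers following along, the paired-call flow described above (back to origin, process, back to active) can be sketched in plain userspace C. This is only an illustration under stated assumptions: lag_state, move_vf_nodes(), reclaim_to_primary() and restore_to_active() are hypothetical stand-ins, not driver code, and the lag_mutex that would be held across both calls is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INVALID_PORT 0xFF /* stand-in for ICE_LAG_INVALID_PORT */

/* Hypothetical, simplified stand-in for struct ice_lag. */
struct lag_state {
	bool bonded;
	bool primary;
	bool has_upper;      /* stand-in for lag->upper_netdev != NULL */
	uint8_t active_port;
	uint8_t nodes_on;    /* port that currently owns the VF nodes */
};

/* Stand-in for ice_lag_move_vf_nodes_cfg(): only records the owner. */
static void move_vf_nodes(struct lag_state *lag, uint8_t src, uint8_t dst)
{
	if (lag->nodes_on == src)
		lag->nodes_on = dst;
}

/*
 * Move the VF nodes back to the primary port if the bond has failed over.
 * Returns the port to restore to afterwards, or INVALID_PORT if nothing
 * was moved. In the driver the caller would already hold lag_mutex.
 */
static uint8_t reclaim_to_primary(struct lag_state *lag, uint8_t pri_prt)
{
	uint8_t act_prt;

	if (!lag || !lag->bonded || !lag->primary)
		return INVALID_PORT;

	act_prt = lag->active_port;
	if (act_prt == pri_prt || act_prt == INVALID_PORT || !lag->has_upper)
		return INVALID_PORT;

	move_vf_nodes(lag, act_prt, pri_prt);
	return act_prt;
}

/* Undo reclaim_to_primary(); a no-op when act_prt is INVALID_PORT. */
static void restore_to_active(struct lag_state *lag, uint8_t pri_prt,
			      uint8_t act_prt)
{
	if (lag && lag->bonded && lag->primary && act_prt != INVALID_PORT)
		move_vf_nodes(lag, pri_prt, act_prt);
}
```

A caller would take the lock, call reclaim_to_primary(), do the reset or reconfiguration, call restore_to_active() with the returned port, and only then drop the lock.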
> 
> Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface")
> Signed-off-by: Dave Ertman <david.m.ertman@...el.com>
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@...el.com>
> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@...el.com>
> Signed-off-by: Tony Nguyen <anthony.l.nguyen@...el.com>
> ---
> This is the net patch mentioned yesterday:
> https://lore.kernel.org/netdev/71058999-50d9-cc17-d940-3f043734e0ee@intel.com/
> 
>  drivers/net/ethernet/intel/ice/ice_lag.c      | 42 +++++++++++++++++++
>  drivers/net/ethernet/intel/ice/ice_lag.h      |  1 +
>  drivers/net/ethernet/intel/ice/ice_vf_lib.c   | 20 +++++++++
>  drivers/net/ethernet/intel/ice/ice_virtchnl.c | 25 +++++++++++
>  4 files changed, 88 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c b/drivers/net/ethernet/intel/ice/ice_lag.c
> index cd065ec48c87..9eed93baa59b 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lag.c
> +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
> @@ -679,6 +679,48 @@ static void ice_lag_move_vf_nodes(struct ice_lag *lag, u8 oldport, u8 newport)
>  			ice_lag_move_single_vf_nodes(lag, oldport, newport, i);
>  }
>  
> +/**
> + * ice_lag_move_vf_nodes_cfg - move VF nodes outside LAG netdev event context
> + * @lag: local lag struct
> + * @src_prt: lport value for source port
> + * @dst_prt: lport value for destination port
> + *
> + * This function is used to move nodes during an out-of-netdev-event situation,
> + * primarily when the driver needs to reconfigure or recreate resources.
> + *
> + * Must be called while holding the lag_mutex to avoid lag events from
> + * processing while out-of-sync moves are happening.  Also, paired moves,
> + * such as used in a reset flow, should both be called under the same mutex
> + * lock to avoid changes between start of reset and end of reset.
> + */
> +void ice_lag_move_vf_nodes_cfg(struct ice_lag *lag, u8 src_prt, u8 dst_prt)
> +{
> +	struct ice_lag_netdev_list ndlist, *nl;
> +	struct list_head *tmp, *n;
> +	struct net_device *tmp_nd;
> +
> +	INIT_LIST_HEAD(&ndlist.node);
> +	rcu_read_lock();
> +	for_each_netdev_in_bond_rcu(lag->upper_netdev, tmp_nd) {

Why do you need an RCU read-side section for that?

Isn't this already called under a mutex? I'm lacking context here.

> +		nl = kzalloc(sizeof(*nl), GFP_ATOMIC);

Do these have to be new allocations, or could you just use list_move()?

> +		if (!nl)
> +			break;
> +
> +		nl->netdev = tmp_nd;
> +		list_add(&nl->node, &ndlist.node);

Should this be list_add_rcu()?

> +	}
> +	rcu_read_unlock();

You have the very same chunk of code in ice_lag_move_new_vf_nodes(). Could
you pull this out into a common function?

...and in ice_lag_rebuild().
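
A rough idea of what such a shared helper could look like, written as a userspace sketch so it can be compiled on its own. Everything here is an assumption made for illustration: build_netdev_list() and netdev_entry are hypothetical names, calloc() replaces kzalloc(GFP_ATOMIC), and a plain array replaces the for_each_netdev_in_bond_rcu() walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal userspace stand-ins for the kernel types used in the patch. */
struct list_head { struct list_head *next, *prev; };
struct net_device { int ifindex; };

/* Plays the role of struct ice_lag_netdev_list. */
struct netdev_entry {
	struct list_head node;
	struct net_device *netdev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert @entry right after @head, as the kernel's list_add() does. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/*
 * Hypothetical shared helper: snapshot the bond members into @head.
 * @devs and @n stand in for the bond's netdev iteration.
 * Returns the number of entries added; stops early if an allocation
 * fails, which mirrors the "break" in the quoted code.
 */
static size_t build_netdev_list(struct list_head *head,
				struct net_device **devs, size_t n)
{
	size_t added = 0;

	init_list_head(head);
	for (size_t i = 0; i < n; i++) {
		struct netdev_entry *e = calloc(1, sizeof(*e));

		if (!e)
			break;
		e->netdev = devs[i];
		list_add(&e->node, head);
		added++;
	}
	return added;
}
```

With one helper like this, the three call sites would only differ in what they do with the list once it is built.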

> +	lag->netdev_head = &ndlist.node;
> +	ice_lag_move_vf_nodes(lag, src_prt, dst_prt);
> +
> +	list_for_each_safe(tmp, n, &ndlist.node) {

Use list_for_each_entry_safe() here; it hands you the entry directly, so
the list_entry() call below can go away.
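
Roughly what the teardown loop looks like with the entry-based iterator, as a self-contained userspace sketch. Assumptions: the list primitives are re-implemented locally, netdev_entry and free_netdev_list() are hypothetical names, and the macro takes the entry type explicitly (hence the _t suffix) because portable C lacks the typeof() the kernel macro relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Same shape as the kernel's list_for_each_entry_safe(), but with the
 * entry type passed in. @n always holds the next entry, so the body may
 * unlink and free @pos.
 */
#define list_for_each_entry_safe_t(type, pos, n, head, member)		\
	for ((pos) = container_of((head)->next, type, member),		\
	     (n) = container_of((pos)->member.next, type, member);	\
	     &(pos)->member != (head);					\
	     (pos) = (n),						\
	     (n) = container_of((n)->member.next, type, member))

/* Plays the role of struct ice_lag_netdev_list. */
struct netdev_entry {
	struct list_head node;
	int id;
};

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Unlink and free every entry on @head; returns how many were freed. */
static size_t free_netdev_list(struct list_head *head)
{
	struct netdev_entry *pos, *n;
	size_t freed = 0;

	list_for_each_entry_safe_t(struct netdev_entry, pos, n, head, node) {
		list_del(&pos->node);
		free(pos);
		freed++;
	}
	return freed;
}
```

In the driver itself the stock list_for_each_entry_safe(nl, tmp, &ndlist.node, node) form would be used, with no explicit type argument.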

> +		nl = list_entry(tmp, struct ice_lag_netdev_list, node);
> +		list_del(&nl->node);
> +		kfree(nl);
> +	}
> +	lag->netdev_head = NULL;
> +}
> +
>  #define ICE_LAG_SRIOV_CP_RECIPE		10
>  #define ICE_LAG_SRIOV_TRAIN_PKT_LEN	16
>  
> diff --git a/drivers/net/ethernet/intel/ice/ice_lag.h b/drivers/net/ethernet/intel/ice/ice_lag.h
> index 9557e8605a07..ede833dfa658 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lag.h
> +++ b/drivers/net/ethernet/intel/ice/ice_lag.h
> @@ -65,4 +65,5 @@ int ice_init_lag(struct ice_pf *pf);
>  void ice_deinit_lag(struct ice_pf *pf);
>  void ice_lag_rebuild(struct ice_pf *pf);
>  bool ice_lag_is_switchdev_running(struct ice_pf *pf);
> +void ice_lag_move_vf_nodes_cfg(struct ice_lag *lag, u8 src_prt, u8 dst_prt);
>  #endif /* _ICE_LAG_H_ */
> diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> index aca1f2ea5034..b7ae09952156 100644
> --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> @@ -829,12 +829,16 @@ static void ice_notify_vf_reset(struct ice_vf *vf)
>  int ice_reset_vf(struct ice_vf *vf, u32 flags)
>  {
>  	struct ice_pf *pf = vf->pf;
> +	struct ice_lag *lag;
>  	struct ice_vsi *vsi;
> +	u8 act_prt, pri_prt;
>  	struct device *dev;
>  	int err = 0;
>  	bool rsd;
>  
>  	dev = ice_pf_to_dev(pf);
> +	act_prt = ICE_LAG_INVALID_PORT;
> +	pri_prt = pf->hw.port_info->lport;
>  
>  	if (flags & ICE_VF_RESET_NOTIFY)
>  		ice_notify_vf_reset(vf);
> @@ -845,6 +849,17 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
>  		return 0;
>  	}
>  
> +	lag = pf->lag;
> +	mutex_lock(&pf->lag_mutex);
> +	if (lag && lag->bonded && lag->primary) {
> +		act_prt = lag->active_port;
> +		if (act_prt != pri_prt && act_prt != ICE_LAG_INVALID_PORT &&
> +		    lag->upper_netdev)
> +			ice_lag_move_vf_nodes_cfg(lag, act_prt, pri_prt);
> +		else
> +			act_prt = ICE_LAG_INVALID_PORT;
> +	}
> +
>  	if (flags & ICE_VF_RESET_LOCK)
>  		mutex_lock(&vf->cfg_lock);
>  	else
> @@ -937,6 +952,11 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
>  	if (flags & ICE_VF_RESET_LOCK)
>  		mutex_unlock(&vf->cfg_lock);
>  
> +	if (lag && lag->bonded && lag->primary &&
> +	    act_prt != ICE_LAG_INVALID_PORT)
> +		ice_lag_move_vf_nodes_cfg(lag, pri_prt, act_prt);
> +	mutex_unlock(&pf->lag_mutex);
> +
>  	return err;
>  }
>  
> diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
> index cdf17b1e2f25..de11b3186bd7 100644
> --- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c
> +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
> @@ -1603,9 +1603,24 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
>  	    (struct virtchnl_vsi_queue_config_info *)msg;
>  	struct virtchnl_queue_pair_info *qpi;
>  	struct ice_pf *pf = vf->pf;
> +	struct ice_lag *lag;
>  	struct ice_vsi *vsi;
> +	u8 act_prt, pri_prt;
>  	int i = -1, q_idx;
>  
> +	lag = pf->lag;
> +	mutex_lock(&pf->lag_mutex);
> +	act_prt = ICE_LAG_INVALID_PORT;
> +	pri_prt = pf->hw.port_info->lport;
> +	if (lag && lag->bonded && lag->primary) {
> +		act_prt = lag->active_port;
> +		if (act_prt != pri_prt && act_prt != ICE_LAG_INVALID_PORT &&
> +		    lag->upper_netdev)
> +			ice_lag_move_vf_nodes_cfg(lag, act_prt, pri_prt);
> +		else
> +			act_prt = ICE_LAG_INVALID_PORT;
> +	}
> +
>  	if (!test_bit(ICE_VF_STATE_ACTIVE, vf->vf_states))
>  		goto error_param;
>  
> @@ -1729,6 +1744,11 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
>  		}
>  	}
>  
> +	if (lag && lag->bonded && lag->primary &&
> +	    act_prt != ICE_LAG_INVALID_PORT)
> +		ice_lag_move_vf_nodes_cfg(lag, pri_prt, act_prt);
> +	mutex_unlock(&pf->lag_mutex);
> +
>  	/* send the response to the VF */
>  	return ice_vc_send_msg_to_vf(vf, VIRTCHNL_OP_CONFIG_VSI_QUEUES,
>  				     VIRTCHNL_STATUS_SUCCESS, NULL, 0);
> @@ -1743,6 +1763,11 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
>  				vf->vf_id, i);
>  	}
>  
> +	if (lag && lag->bonded && lag->primary &&
> +	    act_prt != ICE_LAG_INVALID_PORT)
> +		ice_lag_move_vf_nodes_cfg(lag, pri_prt, act_prt);
> +	mutex_unlock(&pf->lag_mutex);
> +
>  	ice_lag_move_new_vf_nodes(vf);
>  
>  	/* send the response to the VF */
> -- 
> 2.41.0
> 
>