netdev - Re: [PATCH net] ice: Fix VF Reset paths when interface in a failed over aggregate

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZVema0m2Pw6+VYTF@boxer>
Date: Fri, 17 Nov 2023 18:44:11 +0100
From: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
To: "Ertman, David M" <david.m.ertman@...el.com>
CC: "Nguyen, Anthony L" <anthony.l.nguyen@...el.com>, "davem@...emloft.net"
	<davem@...emloft.net>, "kuba@...nel.org" <kuba@...nel.org>,
	"pabeni@...hat.com" <pabeni@...hat.com>, "edumazet@...gle.com"
	<edumazet@...gle.com>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"Wyborny, Carolyn" <carolyn.wyborny@...el.com>, "daniel.machon@...rochip.com"
	<daniel.machon@...rochip.com>, "Kitszel, Przemyslaw"
	<przemyslaw.kitszel@...el.com>, "Buvaneswaran, Sujai"
	<sujai.buvaneswaran@...el.com>
Subject: Re: [PATCH net] ice: Fix VF Reset paths when interface in a failed
 over aggregate

On Thu, Nov 16, 2023 at 10:36:37PM +0100, Ertman, David M wrote:
> > -----Original Message-----
> > From: Fijalkowski, Maciej <maciej.fijalkowski@...el.com>
> > Sent: Thursday, November 16, 2023 6:22 AM
> > To: Nguyen, Anthony L <anthony.l.nguyen@...el.com>
> > Cc: davem@...emloft.net; kuba@...nel.org; pabeni@...hat.com;
> > edumazet@...gle.com; netdev@...r.kernel.org; Ertman, David M
> > <david.m.ertman@...el.com>; Wyborny, Carolyn
> > <carolyn.wyborny@...el.com>; daniel.machon@...rochip.com; Kitszel,
> > Przemyslaw <przemyslaw.kitszel@...el.com>; Buvaneswaran, Sujai
> > <sujai.buvaneswaran@...el.com>
> > Subject: Re: [PATCH net] ice: Fix VF Reset paths when interface in a failed
> > over aggregate
> > 
> > On Wed, Nov 15, 2023 at 01:12:41PM -0800, Tony Nguyen wrote:
> > > From: Dave Ertman <david.m.ertman@...el.com>
> > >
> > > There is an error when an interface has the following conditions:
> > > - PF is in an aggregate (bond)
> > > - PF has VFs created on it
> > > - bond is in a state where it is failed-over to the secondary interface
> > > - A VF reset is issued on one or more of those VFs
> > >
> > > The issue is generated by the originating PF trying to rebuild or
> > > reconfigure the VF resources.  Since the bond is failed over to the
> > > secondary interface the queue contexts are in a modified state.
> > >
> > > To fix this issue, have the originating interface reclaim its resources
> > > prior to the tear-down and rebuild or reconfigure.  Then after the process
> > > is complete, move the resources back to the currently active interface.
> > >
> > > There are multiple paths that can be used depending on what triggered the
> > > event, so create a helper function to move the queues and use paired calls
> > > to the helper (back to origin, process, then move back to active interface)
> > > under the same lag_mutex lock.
> > >
> > > Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV
> > on bonded interface")
> > > Signed-off-by: Dave Ertman <david.m.ertman@...el.com>
> > > Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@...el.com>
> > > Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@...el.com>
> > > Signed-off-by: Tony Nguyen <anthony.l.nguyen@...el.com>
> > > ---
> > > This is the net patch mentioned yesterday:
> > > https://lore.kernel.org/netdev/71058999-50d9-cc17-d940-
> > 3f043734e0ee@...el.com/
> > >
> > >  drivers/net/ethernet/intel/ice/ice_lag.c      | 42 +++++++++++++++++++
> > >  drivers/net/ethernet/intel/ice/ice_lag.h      |  1 +
> > >  drivers/net/ethernet/intel/ice/ice_vf_lib.c   | 20 +++++++++
> > >  drivers/net/ethernet/intel/ice/ice_virtchnl.c | 25 +++++++++++
> > >  4 files changed, 88 insertions(+)
> > >
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_lag.c
> > b/drivers/net/ethernet/intel/ice/ice_lag.c
> > > index cd065ec48c87..9eed93baa59b 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_lag.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_lag.c
> > > @@ -679,6 +679,48 @@ static void ice_lag_move_vf_nodes(struct ice_lag
> > *lag, u8 oldport, u8 newport)
> > >  			ice_lag_move_single_vf_nodes(lag, oldport,
> > newport, i);
> > >  }
> > >
> > > +/**
> > > + * ice_lag_move_vf_nodes_cfg - move VF nodes outside LAG netdev
> > event context
> > > + * @lag: local lag struct
> > > + * @src_prt: lport value for source port
> > > + * @dst_prt: lport value for destination port
> > > + *
> > > + * This function is used to move nodes during an out-of-netdev-event
> > situation,
> > > + * primarily when the driver needs to reconfigure or recreate resources.
> > > + *
> > > + * Must be called while holding the lag_mutex to avoid lag events from
> > > + * processing while out-of-sync moves are happening.  Also, paired
> > moves,
> > > + * such as used in a reset flow, should both be called under the same
> > mutex
> > > + * lock to avoid changes between start of reset and end of reset.
> > > + */
> > > +void ice_lag_move_vf_nodes_cfg(struct ice_lag *lag, u8 src_prt, u8
> > dst_prt)
> > > +{
> > > +	struct ice_lag_netdev_list ndlist, *nl;
> > > +	struct list_head *tmp, *n;
> > > +	struct net_device *tmp_nd;
> > > +
> > > +	INIT_LIST_HEAD(&ndlist.node);
> > > +	rcu_read_lock();
> > > +	for_each_netdev_in_bond_rcu(lag->upper_netdev, tmp_nd) {
> > 
> > Why do you need rcu section for that?
> > 
> > under mutex? lacking context here.
> > 
> 
> Mutex lock is to stop lag event thread from processing any other event until
> after the asynchronous reset is processed.  RCU lock is to stop upper kernel
> bonding driver from changing elements while reset is building a list.

Can you point me to relevant piece of code for upper kernel bonding
driver? Is there synchronize_rcu() on updates?
> 
> > > +		nl = kzalloc(sizeof(*nl), GFP_ATOMIC);
> > 
> > do these have to be new allocations or could you just use list_move?
> > 
> 
> Building a list from scratch - nothing to move until it is created.

Sorry got confused.

Couldn't you keep the up-to-date list of netdevs instead? And avoid all
the building list and then deleting it after processing event?

> 
> > > +		if (!nl)
> > > +			break;
> > > +
> > > +		nl->netdev = tmp_nd;
> > > +		list_add(&nl->node, &ndlist.node);
> > 
> > list_add_rcu ?
> > 
> 
> I thought list_add_rcu was for internal list manipulation when prev and next
> Are both known and defined?

First time I hear this TBH but disregard the suggestion.

> 
> > > +	}
> > > +	rcu_read_unlock();
> > 
> > you have the very same chunk of code in ice_lag_move_new_vf_nodes().
> > pull
> > this out to common function?
> > 
> > ...and in ice_lag_rebuild().
> > 
> 
> Nice catch - for v2, pulled out code into two helper function:
> ice_lag_build_netdev_list()
> Iie_lag_destroy_netdev_list()
> 
> 
> > > +	lag->netdev_head = &ndlist.node;
> > > +	ice_lag_move_vf_nodes(lag, src_prt, dst_prt);
> > > +
> > > +	list_for_each_safe(tmp, n, &ndlist.node) {
> > 
> > use list_for_each_entry_safe()
> > 
> 
> Changed in v2.
> 
> > > +		nl = list_entry(tmp, struct ice_lag_netdev_list, node);
> > > +		list_del(&nl->node);
> > > +		kfree(nl);
> > > +	}
> > > +	lag->netdev_head = NULL;
> 
> Thanks for the review!
> DaveE