[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <MW5PR11MB5811C4226F1402FD29EC96CEDDE69@MW5PR11MB5811.namprd11.prod.outlook.com>
Date: Fri, 16 Dec 2022 19:08:40 +0000
From: "Ertman, David M" <david.m.ertman@...el.com>
To: Saeed Mahameed <saeed@...nel.org>
CC: "Nguyen, Anthony L" <anthony.l.nguyen@...el.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"kuba@...nel.org" <kuba@...nel.org>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"Saleem, Shiraz" <shiraz.saleem@...el.com>,
"Ismail, Mustafa" <mustafa.ismail@...el.com>,
"jgg@...dia.com" <jgg@...dia.com>,
"leonro@...dia.com" <leonro@...dia.com>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"G, GurucharanX" <gurucharanx.g@...el.com>
Subject: RE: [PATCH net 2/4] ice: git send-email --suppress-cc=all --to
e1000-patches@...ists.intel.com
> -----Original Message-----
> From: Saeed Mahameed <saeed@...nel.org>
> Sent: Friday, December 9, 2022 11:28 AM
> To: Ertman, David M <david.m.ertman@...el.com>
> Cc: Nguyen, Anthony L <anthony.l.nguyen@...el.com>;
> davem@...emloft.net; kuba@...nel.org; pabeni@...hat.com;
> edumazet@...gle.com; netdev@...r.kernel.org; Saleem, Shiraz
> <shiraz.saleem@...el.com>; Ismail, Mustafa <mustafa.ismail@...el.com>;
> jgg@...dia.com; leonro@...dia.com; linux-rdma@...r.kernel.org; G,
> GurucharanX <gurucharanx.g@...el.com>
> Subject: Re: [PATCH net 2/4] ice: Correctly handle aux device when num
> channels change
>
> On 09 Dec 17:21, Ertman, David M wrote:
> >> -----Original Message-----
> >> From: Saeed Mahameed <saeed@...nel.org>
> >> Sent: Wednesday, December 7, 2022 2:26 PM
> >> To: Nguyen, Anthony L <anthony.l.nguyen@...el.com>
> >> Cc: davem@...emloft.net; kuba@...nel.org; pabeni@...hat.com;
> >> edumazet@...gle.com; Ertman, David M <david.m.ertman@...el.com>;
> >> netdev@...r.kernel.org; Saleem, Shiraz <shiraz.saleem@...el.com>;
> Ismail,
> >> Mustafa <mustafa.ismail@...el.com>; jgg@...dia.com;
> leonro@...dia.com;
> >> linux-rdma@...r.kernel.org; G, GurucharanX
> <gurucharanx.g@...el.com>
> >> Subject: Re: [PATCH net 2/4] ice: Correctly handle aux device when num
> >> channels change
> >>
> >> On 07 Dec 13:10, Tony Nguyen wrote:
> >> >From: Dave Ertman <david.m.ertman@...el.com>
> >> >
> >> >When the number of channels/queues changes on an interface, it is
> >> necessary
> >> >to change how those resources are distributed to the auxiliary device for
> >> >maintaining RDMA functionality. To do this, the best way is to unplug,
> and
> >>
> >> Can you please explain how an ethtool can affect RDMA functionality ?
> >> don't you have full bifurcation between the two eth and rdma interfaces
> ..
> >>
> >This patch is to address a bug where the number of queues for the
> interface was
> >changed and the RDMA lost functionality due to queues being re-assigned.
> >
>
> sounds like an rdma or PF bug.
>
> >The PF is managing and setting aside resources for the RDMA aux dev. Then
> the
> >RDMA aux driver will request resources from the PF driver. Changes in
> >the total number of resources make it so that resources previously
> >allocated to RDMA aux driver may not be available any more. A re-
> allocation
> >is necessary to ensure that RDMA has all of the queues that it thinks it does.
> >
>
> IMO it's wrong to re-initialize a parallel subsystems due to an ethtool,
> ethtool is meant to control the netdev interface, not rdma. Either
> statically allocate these resources on boot or just store them in a free
> pool in PF until next time rdma reloads by an rdma user command outside of
> netdev.
>
> >> >then re-plug the auxiliary device. This will cause all current resource
> >> >allocation to be released, and then re-requested under the new state.
> >> >
> >>
> >> I find this really disruptive, changing number of netdev queues to cause
> >> full aux devs restart !
> >>
> >
> >Changing the number of queues available to the interface *is* a disruptive
> action.
>
> yes to the netdev, not to rdma or usb, or the pci bus, you get my point.
>
> >The netdev and VSI have to be re-configured for queues per TC and the
> RDMA aux
> >driver has to re-allocate qsets to attach queue-pairs to.
> >
>
> sure for netdev, but it doesn't make sense to reload rdma, if rdma was
> using X queues, it should continue using X queues.. if you can't support
> dynamic netdev changes without disturbing rdma, then block ethtool until
> user unloads rdma.
>
> >> >Since the set_channel command from ethtool comes in while holding
> the
> >> RTNL
> >> >lock, it is necessary to offset the plugging and unplugging of auxiliary
> >> >device to another context. For this purpose, set the flags for UNPLUG
> and
> >> >PLUG in the PF state, then respond to them in the service task.
> >> >
> >> >Also, since the auxiliary device will be unplugged/plugged at the end of
> >> >the flow, it is better to not send the event for TCs changing in the
> >> >middle of the flow. This will prevent a timing issue between the events
> >> >and the probe/release calls conflicting.
> >> >
> >> >Fixes: 348048e724a0 ("ice: Implement iidc operations")
> >> >Signed-off-by: Dave Ertman <david.m.ertman@...el.com>
> >> >Tested-by: Gurucharan G <gurucharanx.g@...el.com> (A Contingent
> >> worker at Intel)
> >> >Signed-off-by: Tony Nguyen <anthony.l.nguyen@...el.com>
> >> >---
> >> > drivers/net/ethernet/intel/ice/ice.h | 2 ++
> >> > drivers/net/ethernet/intel/ice/ice_ethtool.c | 6 ++++++
> >> > drivers/net/ethernet/intel/ice/ice_idc.c | 3 +++
> >> > drivers/net/ethernet/intel/ice/ice_main.c | 3 +++
> >> > 4 files changed, 14 insertions(+)
> >> >
> >> >diff --git a/drivers/net/ethernet/intel/ice/ice.h
> >> b/drivers/net/ethernet/intel/ice/ice.h
> >> >index 001500afc4a6..092e572768fe 100644
> >> >--- a/drivers/net/ethernet/intel/ice/ice.h
> >> >+++ b/drivers/net/ethernet/intel/ice/ice.h
> >> >@@ -281,6 +281,7 @@ enum ice_pf_state {
> >> > ICE_FLTR_OVERFLOW_PROMISC,
> >> > ICE_VF_DIS,
> >> > ICE_CFG_BUSY,
> >> >+ ICE_SET_CHANNELS,
> >> > ICE_SERVICE_SCHED,
> >> > ICE_SERVICE_DIS,
> >> > ICE_FD_FLUSH_REQ,
> >> >@@ -485,6 +486,7 @@ enum ice_pf_flags {
> >> > ICE_FLAG_VF_VLAN_PRUNING,
> >> > ICE_FLAG_LINK_LENIENT_MODE_ENA,
> >> > ICE_FLAG_PLUG_AUX_DEV,
> >> >+ ICE_FLAG_UNPLUG_AUX_DEV,
> >> > ICE_FLAG_MTU_CHANGED,
> >> > ICE_FLAG_GNSS, /* GNSS successfully
> >> initialized */
> >> > ICE_PF_FLAGS_NBITS /* must be last */
> >> >diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c
> >> b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> >> >index b7be84bbe72d..37e174a19860 100644
> >> >--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
> >> >+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> >> >@@ -3536,6 +3536,8 @@ static int ice_set_channels(struct net_device
> >> *dev, struct ethtool_channels *ch)
> >> > return -EINVAL;
> >> > }
> >> >
> >> >+ set_bit(ICE_SET_CHANNELS, pf->state);
> >> >+
> >> > ice_vsi_recfg_qs(vsi, new_rx, new_tx);
> >> >
> >> > if (!netif_is_rxfh_configured(dev))
> >> >@@ -3543,6 +3545,10 @@ static int ice_set_channels(struct net_device
> >> *dev, struct ethtool_channels *ch)
> >> >
> >> > /* Update rss_size due to change in Rx queues */
> >> > vsi->rss_size = ice_get_valid_rss_size(&pf->hw, new_rx);
> >> >+ clear_bit(ICE_SET_CHANNELS, pf->state);
> >> >+
> >>
> >> you just set this new state a few lines ago, clearing the bit in the same
> >> function few lines later seems to be an abuse of the pf state machine,
> >> couldn't you just pass a parameter to the functions which needed this
> >> information ?
> >>
> >
> >How is this abusing the PF state machine? There is a 3 deep function call
> that needs
> >the information that this is a set_channel context, and each of those
> functions is called
> >from several locations - how is changing all of those functions to include a
> parameter
>
> this is exactly the abuse i was talking about, setting a flag on a global
> state field because the function call is too deep.
>
> >(that will be false for all of them but this instance) be less abusive than
> setting and
> >clearing a bit?
>
> this is not future sustainable and not reviewer friendly, it's a local
> parameter and shouldn't be a global flag.
>
> Anyway this is your driver, i am not going to force you to do something you
> don't like.
>
> but for the reloading of rdma due to an ethtool is a no for me..
> let's find a common place for all vendors to express such limitations,
> e.g. devlink ..
The set_channels function will cause the VSI controlling the pool of resources that
RDMA uses to be re-allocated ad reconfigured. So, based on this, I will submit a new patch
to prevent set_channels when RDMA is probed as you suggested.
Please abandon this patch, new one will be submitted.
DaveE
>
> >
> >> >+ set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags);
> >> >+ set_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags);
> >> >
> >> > return 0;
> >> > }
> >
Powered by blists - more mailing lists