Message-ID: <20230413164434.GT17993@unreal>
Date: Thu, 13 Apr 2023 19:44:34 +0300
From: Leon Romanovsky <leon@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Shannon Nelson <shannon.nelson@....com>, brett.creeley@....com,
davem@...emloft.net, netdev@...r.kernel.org, drivers@...sando.io,
jiri@...nulli.us
Subject: Re: [PATCH v9 net-next 13/14] pds_core: publish events to the clients
On Thu, Apr 13, 2023 at 08:14:10AM -0700, Jakub Kicinski wrote:
> On Thu, 13 Apr 2023 11:55:01 +0300 Leon Romanovsky wrote:
> > > > > diff --git a/drivers/net/ethernet/amd/pds_core/adminq.c b/drivers/net/ethernet/amd/pds_core/adminq.c
> > > > > index 25c7dd0d37e5..bb18ac1aabab 100644
> > > > > --- a/drivers/net/ethernet/amd/pds_core/adminq.c
> > > > > +++ b/drivers/net/ethernet/amd/pds_core/adminq.c
> > > > > @@ -27,11 +27,13 @@ static int pdsc_process_notifyq(struct pdsc_qcq *qcq)
> > > > > case PDS_EVENT_LINK_CHANGE:
> > > > > dev_info(pdsc->dev, "NotifyQ LINK_CHANGE ecode %d eid %lld\n",
> > > > > ecode, eid);
> > > > > + pdsc_notify(PDS_EVENT_LINK_CHANGE, comp);
> > > >
> > > > Aren't you "resending" standard netdev event?
> > > > It will be better to send only custom, specific to pds_core events,
> > > > while leaving general ones to netdev.
> > >
> > > We have no netdev in pds_core, so we have to publish this to clients that
> > > might have a netdev or some other need to know.
> >
> > I don't know netdev well enough to say if it is ok or not, and maybe
> > netdev will send this LINK_CHANGE by itself anyway.
> >
> > Jakub???
>
> I actually prefer for the driver to distribute the event via its own
> means than some random borderline proprietary stuff outside of netdev
> using netdev events.
ok
>
> > > > We can argue if clients should get this event. Once reset is detected,
> > > > the pds_core should close devices by deleting aux drivers.
> > >
> > > We can get a reset signal from the device when it has done a crash recovery
> > > or when it is preparing to do an update, and this allows clients to quiesce
> > > their operations when reset.state==0 and restart when they see
> > > reset.state==1
> >
> > I don't think that it is safe behaviour from user POV. If FW resets
> > itself under the hood, how can client be sure that nothing changes
> > in its operation? Once FW reset occurs, it is much safer for the clients
> > to reconfigure everything.
>
> What's the argument exactly? We do have async resets including in mlx5,
> grep for enable_remote_dev_reset
I think that it is different. My complaint is that during a FW reset,
auxiliary devices are not recreated and stay connected to the
physical device in the hope that everything will continue to work from
both the kernel and FW perspective.
It is different from enable_remote_dev_reset, where someone externally
resets the device, which triggers the mlx5_device_rescan() routine through
the mlx5_unload_one->mlx5_load_one sequence.
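
To make the distinction concrete, here is a minimal userspace sketch of the
two reset strategies discussed in this thread. It is not driver code: every
name in it (struct core, struct client, fw_reset_notify, fw_reset_recreate,
the fw_gen/cfg_gen counters) is hypothetical and does not come from pds_core
or mlx5. It also assumes the worst case for the notify path, namely that a
client only pauses and resumes and does not reconfigure itself when it sees
reset.state==1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in pds_core or mlx5. */
#define NCLIENTS 2

struct client {
	bool bound;       /* attached to the parent device */
	bool quiesced;    /* currently not issuing operations */
	unsigned cfg_gen; /* FW generation the client configured itself against */
};

struct core {
	unsigned fw_gen;  /* bumped on every FW reset */
	struct client clients[NCLIENTS];
};

static void core_init(struct core *c)
{
	c->fw_gen = 1;
	for (size_t i = 0; i < NCLIENTS; i++) {
		c->clients[i].bound = true;
		c->clients[i].quiesced = false;
		c->clients[i].cfg_gen = c->fw_gen;
	}
}

/* Publish a reset event: state 0 means quiesce, state 1 means resume. */
static void publish_reset(struct core *c, int state)
{
	for (size_t i = 0; i < NCLIENTS; i++)
		c->clients[i].quiesced = (state == 0);
}

/* Strategy A: clients stay bound across the reset and are only notified. */
static void fw_reset_notify(struct core *c)
{
	publish_reset(c, 0);
	c->fw_gen++;          /* FW comes back as a new generation */
	publish_reset(c, 1);  /* clients resume with their old configuration */
}

/* Strategy B: tear clients down and recreate them around the reset. */
static void fw_reset_recreate(struct core *c)
{
	for (size_t i = 0; i < NCLIENTS; i++)
		c->clients[i].bound = false;   /* unload */
	c->fw_gen++;
	for (size_t i = 0; i < NCLIENTS; i++) { /* load: fresh probe */
		c->clients[i].bound = true;
		c->clients[i].quiesced = false;
		c->clients[i].cfg_gen = c->fw_gen;
	}
}

/* Count bound clients whose configuration predates the current FW. */
static size_t count_stale(const struct core *c)
{
	size_t n = 0;

	assert(c != NULL);
	for (size_t i = 0; i < NCLIENTS; i++)
		if (c->clients[i].bound && c->clients[i].cfg_gen != c->fw_gen)
			n++;
	return n;
}
```

In this model, calling fw_reset_notify() leaves both clients bound but
configured against the previous FW generation, while fw_reset_recreate()
leaves none stale. A real client could of course close that gap by
reconfiguring when it is told the reset has finished; the sketch only shows
why that step cannot be taken for granted.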
Thanks