Message-ID: <20230329073313.GG831478@unreal>
Date: Wed, 29 Mar 2023 10:33:13 +0300
From: Leon Romanovsky <leon@...nel.org>
To: Veerasenareddy Burru <vburru@...vell.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Abhijit Ayarekar <aayarekar@...vell.com>,
Sathesh B Edara <sedara@...vell.com>,
Satananda Burla <sburla@...vell.com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>
Subject: Re: [EXT] Re: [PATCH net-next v4 8/8] octeon_ep: add heartbeat
monitor
On Thu, Mar 23, 2023 at 06:14:10PM +0000, Veerasenareddy Burru wrote:
>
>
> > -----Original Message-----
> > From: Leon Romanovsky <leon@...nel.org>
> > Sent: Thursday, March 23, 2023 3:47 AM
> > To: Veerasenareddy Burru <vburru@...vell.com>
> > Cc: netdev@...r.kernel.org; linux-kernel@...r.kernel.org; Abhijit Ayarekar
> > <aayarekar@...vell.com>; Sathesh B Edara <sedara@...vell.com>;
> > Satananda Burla <sburla@...vell.com>; linux-doc@...r.kernel.org; David S.
> > Miller <davem@...emloft.net>; Eric Dumazet <edumazet@...gle.com>;
> > Jakub Kicinski <kuba@...nel.org>; Paolo Abeni <pabeni@...hat.com>
> > Subject: [EXT] Re: [PATCH net-next v4 8/8] octeon_ep: add heartbeat
> > monitor
> >
> > External Email
> >
> > ----------------------------------------------------------------------
> > On Wed, Mar 22, 2023 at 02:19:57AM -0700, Veerasenareddy Burru wrote:
> > > Monitor periodic heartbeat messages from device firmware.
> > > Presence of heartbeat indicates the device is active and running.
> > > If the heartbeat is missed for configured interval indicates firmware
> > > has crashed and device is unusable; in this case, PF driver stops and
> > > uninitialize the device.
> > >
> > > Signed-off-by: Veerasenareddy Burru <vburru@...vell.com>
> > > Signed-off-by: Abhijit Ayarekar <aayarekar@...vell.com>
> > > ---
> > > v3 -> v4:
> > > * 0007-xxx.patch in v3 is 0008-xxx.patch in v4.
> > >
> > > v2 -> v3:
> > > * 0009-xxx.patch in v2 is now 0007-xxx.patch in v3 due to
> > > 0007 and 0008.patch from v2 are removed in v3.
> > >
> > > v1 -> v2:
> > > * no change
<...>
> > > + struct octep_device *oct = container_of(work, struct octep_device,
> > > + hb_task.work);
> > > +
> > > + int miss_cnt;
> > > +
> > > + atomic_inc(&oct->hb_miss_cnt);
> > > + miss_cnt = atomic_read(&oct->hb_miss_cnt);
> >
> > miss_cnt = atomic_inc_return(&oct->hb_miss_cnt);
> >
>
> Thanks for the feedback. Will fix it.
>
> > > + if (miss_cnt < oct->conf->max_hb_miss_cnt) {
> >
> > How is this heartbeat working? You increment on every entry to
> > octep_hb_timeout_task(), After max_hb_miss_cnt invocations, you will stop
> > your device.
> >
> > Thanks
> >
>
> Yes, device will be stopped after max_hb_miss_cnt heartbeats are missed.
If I read the code correctly, the device will stop after max_hb_miss_cnt
calls to octep_hb_timeout_task(), which happen every
msecs_to_jiffies(oct->conf->hb_interval * 1000).
You don't cancel/reschedule the job if the timeout doesn't happen.
Thanks
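
To make the concern concrete, here is a minimal user-space sketch of the pattern under discussion, not the driver code. It assumes the miss counter is reset somewhere when a heartbeat arrives, which is not visible in the quoted hunk. The names struct hb_monitor, hb_received() and hb_timeout_tick() are hypothetical, and C11 atomic_fetch_add() + 1 stands in for the kernel's atomic_inc_return().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real struct octep_device. */
struct hb_monitor {
	atomic_int miss_cnt;	/* timer ticks seen since the last heartbeat */
	int max_miss_cnt;	/* threshold, playing the role of conf->max_hb_miss_cnt */
	bool stopped;		/* set once the device would be stopped */
};

/* Assumed heartbeat path: a message from firmware clears the miss counter. */
void hb_received(struct hb_monitor *m)
{
	atomic_store(&m->miss_cnt, 0);
}

/*
 * Runs once per heartbeat interval, like the delayed work in the patch.
 * Returns true if the work should be queued again, false if the device
 * would be stopped.
 */
bool hb_timeout_tick(struct hb_monitor *m)
{
	/*
	 * atomic_fetch_add() returns the old value, so adding 1 gives the new
	 * count from a single atomic operation. A separate increment followed
	 * by a read could observe a value changed by a concurrent reset.
	 */
	int miss_cnt = atomic_fetch_add(&m->miss_cnt, 1) + 1;

	if (miss_cnt < m->max_miss_cnt)
		return true;

	m->stopped = true;
	return false;
}
```

If nothing ever calls the reset path, hb_timeout_tick() returns false after max_miss_cnt intervals regardless of firmware state, which is the behaviour questioned above.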
>
> > > + queue_delayed_work(octep_wq, &oct->hb_task,
> > > > + msecs_to_jiffies(oct->conf->hb_interval * 1000));