Message-ID: <18abb6456fb4a2fba52f6f77373ac351651a62c6.camel@mellanox.com>
Date: Mon, 26 Aug 2019 20:14:47 +0000
From: Saeed Mahameed <saeedm@...lanox.com>
To: "jakub.kicinski@...ronome.com" <jakub.kicinski@...ronome.com>
CC: "davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Moshe Shemesh <moshe@...lanox.com>
Subject: Re: [net-next 4/8] net/mlx5e: Add device out of buffer counter
On Fri, 2019-08-23 at 11:16 -0700, Jakub Kicinski wrote:
> On Fri, 23 Aug 2019 06:00:45 +0000, Saeed Mahameed wrote:
> > On Thu, 2019-08-22 at 18:33 -0700, Jakub Kicinski wrote:
> > > On Thu, 22 Aug 2019 23:35:52 +0000, Saeed Mahameed wrote:
> > > > From: Moshe Shemesh <moshe@...lanox.com>
> > > >
> > > > Added the following packet drop counter:
> > > > Device out of buffer - counts packets which were dropped due to a
> > > > full device internal receive queue.
> > > > This counter will be shown in ethtool as a new counter called
> > > > dev_out_of_buffer.
> > > > The counter is read from FW via the QUERY_VNIC_ENV command.
> > > >
> > > > Signed-off-by: Moshe Shemesh <moshe@...lanox.com>
> > > > Signed-off-by: Saeed Mahameed <saeedm@...lanox.com>
> > >
> > > Sounds like rx_fifo_errors, no? Doesn't rx_fifo_errors count RX
> > > overruns?
> >
> > No, that is the port buffer you are looking for, and we have that
> > fully covered in mlx5. This is different.
> >
> > This new counter sits deep in the HW data path pipeline and covers
> > very rare and complex scenarios that were only recently introduced
> > with switchdev mode and "some" recently added tunnel offloads that
> > are routed between VFs/PFs.
> >
> > Normally the HW is lossless once a packet passes the port buffers
> > into the data plane pipeline - let's call that the "fast lane". BUT
> > for SRIOV configurations with switchdev mode enabled and some
> > special hand-crafted tc tunnel offloads that require hairpin between
> > VFs/PFs, the HW might decide to send some traffic to a "service
> > lane", which is still fast path, but unlike the "fast lane" it
> > handles traffic through "HW internal" receive and send queues (just
> > like we do with hairpin) that might drop packets. The whole thing is
> > transparent to the driver and is HW implementation specific.
>
> I see, thanks for the explanation, and sorry for the delayed response.
> Would it perhaps make sense to indicate the hairpin in the name?
We had some internal discussion and couldn't come up with the perfect
name :)

Hairpin is just an implementation detail, and we don't want to bind
this counter exclusively to hairpin-only flows. The problem is not with
hairpin; the actual problem is the use of internal RQs. For now it only
happens with "hairpin-like" flows, but tomorrow it could happen in a
different scenario with the same root cause (the use of internal RQs).
We want one counter that counts internal drops due to the use of
internal RQs.

So how about:
dev_internal_rq_oob: Device internal RQ out of buffer
dev_internal_out_of_res: Device internal out of resources (more
generic? too generic?)

Any suggestion you can provide will be more than welcome.
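
Just to make the mapping concrete, here is a rough sketch from memory
(not the actual patch; the authoritative field name lives in
mlx5_ifc.h): whatever we end up calling it on the ethtool side, the
counter boils down to a single field returned by the QUERY_VNIC_ENV
command:

#include <linux/mlx5/driver.h>
#include <linux/mlx5/mlx5_ifc.h>

/* Query the VNIC environment counters from FW and pull out the
 * internal RQ out-of-buffer drop count.  The field name below is
 * illustrative/assumed.
 */
static u32 query_internal_rq_oob(struct mlx5_core_dev *mdev)
{
	u32 in[MLX5_ST_SZ_DW(query_vnic_env_in)] = {};
	u32 out[MLX5_ST_SZ_DW(query_vnic_env_out)] = {};

	MLX5_SET(query_vnic_env_in, in, opcode,
		 MLX5_CMD_OP_QUERY_VNIC_ENV);
	if (mlx5_cmd_exec(mdev, in, sizeof(in), out, sizeof(out)))
		return 0;

	/* assumed field name for the internal RQ drop count */
	return MLX5_GET(query_vnic_env_out, out,
			vport_env.internal_rq_out_of_buffer);
}

On the user side it just shows up as one more line in the ethtool -S
output, under whatever name we settle on.
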
> dev_out_of_buffer is quite a generic name, and there seems to be no
> doc, nor does the commit message explain it as well as you have...
Regarding documentation:
All mlx5 ethtool counters are documented here:
https://community.mellanox.com/s/article/understanding-mlx5-linux-counters-and-status-parameters
Once we decide on the name, we will add the new counter to the doc.