Message-ID: <9b3af2dd-8b56-4817-b223-c6a85ba80562@nvidia.com>
Date: Wed, 6 Nov 2024 21:23:47 +0200
From: Gal Pressman <gal@...dia.com>
To: Yafang Shao <laoar.shao@...il.com>, Tariq Toukan <ttoukan.linux@...il.com>
Cc: saeedm@...dia.com, tariqt@...dia.com, leon@...nel.org,
 netdev@...r.kernel.org, linux-rdma@...r.kernel.org
Subject: Re: [PATCH] net/mlx5e: Report rx_discards_phy via rx_missed_errors

On 06/11/2024 13:49, Yafang Shao wrote:
> On Wed, Nov 6, 2024 at 5:56 PM Tariq Toukan <ttoukan.linux@...il.com> wrote:
>>
>>
>>
>> On 06/11/2024 8:40, Yafang Shao wrote:
>>> We observed a high number of rx_discards_phy events on some servers when
>>> running `ethtool -S`. However, this important counter is not currently
>>> reflected in the /proc/net/dev statistics file, making it challenging to
>>> monitor effectively.
>>>
>>> Since rx_missed_errors represents packets dropped due to buffer exhaustion,
>>> it makes sense to include rx_discards_phy in this counter to enhance
>>> monitoring visibility. This change will help administrators track these
>>> events more effectively through standard interfaces.
>>>
>>
>> Hi,
>>
>> Thanks for your patch.
>>
>> It's a matter of interpretation...
>> The documentation in
>> Documentation/ABI/testing/sysfs-class-net-statistics refers to the
>> driver for the exact meaning.

I think this documentation is outdated; a more recent description is in if_link.h:

 * @rx_missed_errors: Count of packets missed by the host.
 *   Folded into the "drop" counter in `/proc/net/dev`.
 *
 *   Counts number of packets dropped by the device due to lack
 *   of buffer space. This usually indicates that the host interface
 *   is slower than the network interface, or host is not keeping up
 *   with the receive packet rate.
 *
 *   This statistic corresponds to hardware events and is not used
 *   on software devices.

>>
>> rx_discards_phy counts packet drops due to exhaustion of the physical
>> port memory (not in the host); this happens well before the packet is
>> steered to any receive queue.
>> Today, rx_missed_errors counts SW/host memory buffer exhaustion of the
>> receive queues.
>> I don't think that rx_missed_errors should mix both.
> 
> Thanks for your detailed explanation.
> 
>>
>> Maybe some other counter can be used for rx_discards_phy, like
>> rx_fifo_errors?
> 
> It appears that rx_fifo_errors is a more appropriate counter for this purpose.
> I will submit a v2. Thanks for your suggestion.

Probably not a good idea; the rx_fifo_errors documentation in if_link.h says:
 *   This statistics was used interchangeably with @rx_over_errors.
 *   Not recommended for use in drivers for high speed interfaces.
