lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3931a391-3967-4260-a104-4eb313810c0d@ti.com>
Date: Fri, 7 Mar 2025 16:00:40 +0530
From: MD Danish Anwar <danishanwar@...com>
To: Jakub Kicinski <kuba@...nel.org>
CC: Vignesh Raghavendra <vigneshr@...com>, Meghana Malladi <m-malladi@...com>,
        Diogo Ivo <diogo.ivo@...mens.com>, Lee Trager <lee@...ger.us>,
        Andrew Lunn
	<andrew+netdev@...n.ch>,
        Roger Quadros <rogerq@...nel.org>, Jonathan Corbet
	<corbet@....net>,
        Simon Horman <horms@...nel.org>, Paolo Abeni
	<pabeni@...hat.com>,
        Eric Dumazet <edumazet@...gle.com>,
        "David S. Miller"
	<davem@...emloft.net>,
        <linux-arm-kernel@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
        <linux-doc@...r.kernel.org>, <netdev@...r.kernel.org>, <srk@...com>
Subject: Re: [PATCH net-next v2] net: ti: icssg-prueth: Add ICSSG FW Stats

Hi Jakub

On 07/03/25 6:25 am, Jakub Kicinski wrote:
> On Wed, 5 Mar 2025 16:46:08 +0530 MD Danish Anwar wrote:
>> + - ``FW_RTU_PKT_DROP``: Diagnostic error counter which increments when RTU drops a locally injected packet due to port being disabled or rule violation.
>> + - ``FW_Q0_OVERFLOW``: TX overflow counter for queue0
>> + - ``FW_Q1_OVERFLOW``: TX overflow counter for queue1
>> + - ``FW_Q2_OVERFLOW``: TX overflow counter for queue2
>> + - ``FW_Q3_OVERFLOW``: TX overflow counter for queue3
>> + - ``FW_Q4_OVERFLOW``: TX overflow counter for queue4
>> + - ``FW_Q5_OVERFLOW``: TX overflow counter for queue5
>> + - ``FW_Q6_OVERFLOW``: TX overflow counter for queue6
>> + - ``FW_Q7_OVERFLOW``: TX overflow counter for queue7
> ...
> 
> Thanks for the docs, it looks good. Now, do all of these get included
> in the standard stats returned by icssg_ndo_get_stats64 ?
> That's the primary source of information for the user regarding packet
> loss.

No, these are not reported via icssg_ndo_get_stats64.

.ndo_get_stats64 populates stats that are part of `struct
rtnl_link_stats64`. icssg_ndo_get_stats64 populates those stats wherever
applicable. These firmware stats are not same as the ones defined in
`icssg_ndo_get_stats64` hence they are not populated. They are not
standard stats, they will be dumped by `ethtool -S`. Wherever there is a
standard stats, I will make sure it gets dumped from the standard
interface instead of `ethtool -S`

Only below stats are included in the standard stats returned by
icssg_ndo_get_stats64
- rx_packets
- rx_bytes
- tx_packets
- tx_bytes
- rx_crc_errors
- rx_over_errors
- rx_multicast_frames

> 
>>  	if (prueth->pa_stats) {
>>  		for (i = 0; i < ARRAY_SIZE(icssg_all_pa_stats); i++) {
>> -			reg = ICSSG_FW_STATS_BASE +
>> -			      icssg_all_pa_stats[i].offset *
>> -			      PRUETH_NUM_MACS + slice * sizeof(u32);
>> +			reg = icssg_all_pa_stats[i].offset +
>> +			      slice * sizeof(u32);
>>  			regmap_read(prueth->pa_stats, reg, &val);
>>  			emac->pa_stats[i] += val;
> 
> This gets called by icssg_ndo_get_stats64() which is under RCU 

Yes, this does get called by icssg_ndo_get_stats64(). Apart from that
there is a workqueue (`icssg_stats_work_handler`) that calls this API
periodically and updates the emac->stats and emac->pa_stats arrays.

> protection and nothing else. I don't see any locking here, and

There is no locking here. I don't think this is related to the patch.
The API emac_update_hardware_stats() updates all the stats supported by
ICSSG not just standard stats.

> I hope the regmap doesn't sleep. cat /proc/net/dev to test.
> You probably need to send some fixes to net.

I checked cat /proc/net/dev and the stats shown there are not related to
the stats I am introducing in this series.

The fix could be to add a lock in this function, but we have close to 90
stats and this function is called not only from icssg_ndo_get_stats64()
but from emac_get_ethtool_stats(). The function also gets called
periodically (Every 25 Seconds for 1G Link). I think every time locking
90 regmap_reads() could result in performance degradation.

I only see couple of drivers acquiring spin lock before reading the
stats for .ndo_get_stats64. Most of the drivers are not using any lock.

I did some testing and did not see any discrepancy in the stats `cat
/proc/net/dev` without the lock.

Furthermore, the fix is independent of this patch. I can send out a
separate fix to net to add cpu locks to this function. But I don't think
there is any change needed in this patch.

Let me know what should be done here.

-- 
Thanks and Regards,
Danish

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ