Message-ID: <ZBkS+oLNkFAjcDBn@euler>
Date: Mon, 20 Mar 2023 19:14:18 -0700
From: Colin Foster <colin.foster@...advantage.com>
To: Vladimir Oltean <vladimir.oltean@....com>
Cc: netdev@...r.kernel.org, Andrew Lunn <andrew@...n.ch>,
Florian Fainelli <f.fainelli@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Claudiu Manoil <claudiu.manoil@....com>,
Alexandre Belloni <alexandre.belloni@...tlin.com>,
UNGLinuxDriver@...rochip.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net 1/3] net: mscc: ocelot: fix stats region batching

On Tue, Mar 21, 2023 at 03:03:23AM +0200, Vladimir Oltean wrote:
> The blamed commit changed struct ocelot_stat_layout :: "u32 offset" to
> "u32 reg".
>
> However, "u32 reg" is not quite a register address, but an enum
> ocelot_reg, which in itself encodes an enum ocelot_target target in the
> upper bits, and an index into the ocelot->map[target][] array in the
> lower bits.
>
> So, whereas the previous code comparison between stats_layout[i].offset
> and last + 1 was correct (because those "offsets" at the time were
> 32-bit relative addresses), the new code, comparing layout[i].reg to
> last + 4 is not correct, because the "reg" here is an enum/index, not an
> actual register address.
>
> What we want to compare are indeed register addresses, but to do that,
> we need to actually go through the same motions as
> __ocelot_bulk_read_ix() itself.
>
> With this bug, all statistics counters are deemed by
> ocelot_prepare_stats_regions() as constituting their own region.
> (Truncated) log on VSC9959 (Felix) below (prints added by me):
>
> Before:
>
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x000]
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x001]
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x002]
> ...
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x041]
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x042]
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x080]
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x081]
> ...
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x0ac]
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x100]
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x101]
> ...
> region of 1 contiguous counters starting with SYS:STAT:CNT[0x111]
>
> After:
>
> region of 67 contiguous counters starting with SYS:STAT:CNT[0x000]
> region of 45 contiguous counters starting with SYS:STAT:CNT[0x080]
> region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]

Yes, I verified this with:

`trace-cmd record -p function_graph -l ocelot_* sleep 3`

Before the patch series, on the VSC7512, a call to
ocelot_port_update_stats() takes about 14 ms, with many calls to
ocelot_spi_regmap_bus_read().

After the patch series, the calls take about 2 ms, with four calls to
ocelot_spi_regmap_bus_read().
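
For anyone following along, the difference between comparing enum values
and comparing resolved addresses can be sketched in standalone C roughly
as below. This is only an illustration, not the driver code: the
ENCODE() macro, the bit layout, sys_map[], resolve_addr() and both
count_regions_*() helpers are hypothetical names and values made up for
the example, and the real lookup goes through ocelot->map[target][].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding: target in the upper bits, index into a
 * per-target map in the lower bits. Not the real ocelot layout. */
#define TARGET_SHIFT 16
#define INDEX_MASK   0xffffu
#define ENCODE(target, idx) (((uint32_t)(target) << TARGET_SHIFT) | (idx))

enum { TGT_SYS, NUM_TARGETS };

/* Made-up map from index to byte address of a 32-bit counter. */
static const uint32_t sys_map[] = {
	0x000, 0x004, 0x008,	/* three adjacent counters */
	0x200, 0x204,		/* two more after a gap */
	0x400,			/* one on its own */
};

static const uint32_t *const reg_map[NUM_TARGETS] = { sys_map };

/* Translate an encoded "reg" into the address it stands for. */
uint32_t resolve_addr(uint32_t reg)
{
	uint32_t target = reg >> TARGET_SHIFT;

	assert(target < NUM_TARGETS);
	return reg_map[target][reg & INDEX_MASK];
}

/* Buggy variant: treats the encoded value as if it were an address. */
size_t count_regions_by_enum(const uint32_t *regs, size_t n)
{
	size_t regions = 0;
	uint32_t last = 0;

	for (size_t i = 0; i < n; i++) {
		if (i == 0 || regs[i] != last + 4)
			regions++;
		last = regs[i];
	}
	return regions;
}

/* Fixed variant: resolves each entry first, then compares addresses. */
size_t count_regions_by_addr(const uint32_t *regs, size_t n)
{
	size_t regions = 0;
	uint32_t last = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t addr = resolve_addr(regs[i]);

		if (i == 0 || addr != last + 4)
			regions++;
		last = addr;
	}
	return regions;
}

/* Six counters with consecutive indices 0..5 in the SYS target. */
static const uint32_t demo_regs[] = {
	ENCODE(TGT_SYS, 0), ENCODE(TGT_SYS, 1), ENCODE(TGT_SYS, 2),
	ENCODE(TGT_SYS, 3), ENCODE(TGT_SYS, 4), ENCODE(TGT_SYS, 5),
};
```

With demo_regs, the enum-based comparison never sees "last + 4" (the
encoded values only step by 1), so every counter ends up in its own
region, while the address-based comparison groups them into the three
contiguous runs present in sys_map[]. Fewer regions means fewer bulk
reads, which matches the drop in bus reads I measured above.
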
Acked-by: Colin Foster <colin.foster@...advantage.com>
Tested-by: Colin Foster <colin.foster@...advantage.com>