[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8bf64451-fd0c-4365-8d47-71792a9603a5@intel.com>
Date: Tue, 27 Feb 2024 11:49:22 -0800
From: "Nambiar, Amritha" <amritha.nambiar@...el.com>
To: Jakub Kicinski <kuba@...nel.org>, Stanislav Fomichev <sdf@...gle.com>
CC: <davem@...emloft.net>, <netdev@...r.kernel.org>, <edumazet@...gle.com>,
<pabeni@...hat.com>, <danielj@...dia.com>, <mst@...hat.com>,
<michael.chan@...adcom.com>, <vadim.fedorenko@...ux.dev>
Subject: Re: [PATCH net-next 1/3] netdev: add per-queue statistics
On 2/27/2024 7:24 AM, Jakub Kicinski wrote:
> On Mon, 26 Feb 2024 19:37:04 -0800 Stanislav Fomichev wrote:
>> On 02/26, Jakub Kicinski wrote:
>>> On Mon, 26 Feb 2024 13:35:34 -0800 Stanislav Fomichev wrote:
>>>> IIUC, in order to get netdev-scoped stats in v1 (vs rfc) is to not set
>>>> stats-scope, right? Any reason we dropped the explicit netdev entry?
>>>> It seems more robust with a separate entry and removes the ambiguity about
>>>> which stats we're querying.
>>>
>>> The change is because I switched from enum to flags.
>>>
>>> I'm not 100% sure which one is going to cause fewer issues down
>>> the line. It's a question of whether the next scope we add will
>>> be disjoint with or subdividing previous scopes.
>>>
>>> I think only subdividing previous scopes makes sense. If we were
>>> to add "stats per NAPI" (bad example) or "per buffer pool" or IDK what
>>> other thing -- we should expose that as a new netlink command, not mix
>>> it with the queues.
>>>
>>> The expectation is that scopes will be extended with hw vs sw, or
>>> per-CPU (e.g. page pool recycling). In which case we'll want flags,
>>> so that we can combine them -- ask for HW stats for a queue or hw
>>> stats for the entire netdev.
>>>
>>> Perhaps I should rename stats -> queue-stats to make this more explicit?
>>>
>>> The initial version I wrote could iterate both over NAPIs and
>>> queues. This could be helpful to some drivers - but I realized that it
>>> would lead to rather painful user experience (does the driver maintain
>>> stats per NAPI or per queue?) and tricky implementation of the device
>>> level sum (device stats = Sum(queue) or Sum(queue) + Sum(NAPI)??)
>>
>> Yeah, same, not sure. The flags may be more flexible but a bit harder
>> wrt discoverability. Assuming a somewhat ignorant spec reader/user,
>> it might be hard to say which flags makes sense to combine and which isn't.
>> Or, I guess, we can try to document it?
>
> We're talking about driver API here, so document and enforce in code
> review :) But fundamentally, I don't think we should be turning this op
> into a mux for all sort of stats. We can have 64k ops in the family.
>
>> For HW vs SW, do you think it makes sense to expose it as a scope?
>> Why not have something like 'rx-packets' and 'hw-rx-packets'?
>
> I had that in one of the WIP versions but (a) a lot of the stats can
> be maintained by either device or the driver, so we'd end up with a hw-
> flavor for most of the entries, and (b) 90% of the time the user will
> not care whether it's the HW or SW that counted the bytes, or GSO
> segments. Similarly to how most of the users will not care about
> per-queue breakdown, TBH, which made me think that from user
> perspective both queue and hw vs sw are just a form of detailed
> breakdown. Majority will dump the combined sw|hw stats for the device.
>
> I could be wrong.
>
I think the per-queue breakdown would be useful as well, especially in
usecases where there are filters directing traffic to different queues.
>> Maybe, as you're suggesting, we should rename stats to queue-states
>> and drop the score for now? When the time comes to add hw counters,
>> we can revisit. For total netdev stats, we can ask the user to aggregate
>> the per-queue ones?
>
> I'd keep the scope, and ability to show the device level aggregation.
> There are drivers (bnxt, off the top of my head, but I feel like there's
> more) which stash the counters when queues get freed. Without the device
> level aggregation we'd need to expose that as "no queue" or "history"
> or "delta" etc stats. I think that's uglier that showing the sum, which
> is what user will care about 99% of the time.
>
+1 to retaining the scope and device level aggregation to reduce ambiguity.
> It'd be a pure rename.
Powered by blists - more mailing lists