Message-ID: <f5206993-c967-eec2-4679-6054f954c271@gmail.com>
Date: Thu, 3 Feb 2022 21:21:17 +0200
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Joe Damato <jdamato@...tly.com>,
Jesper Dangaard Brouer <jbrouer@...hat.com>
Cc: netdev@...r.kernel.org, kuba@...nel.org,
ilias.apalodimas@...aro.org, davem@...emloft.net, hawk@...nel.org,
Saeed Mahameed <saeed@...nel.org>, brouer@...hat.com
Subject: Re: [net-next v3 00/10] page_pool: Add page_pool stat counters
On 2/2/2022 7:30 PM, Joe Damato wrote:
> On Wed, Feb 2, 2022 at 6:31 AM Jesper Dangaard Brouer
> <jbrouer@...hat.com> wrote:
>>
>>
>> Adding Cc. Tariq and Saeed, as they wanted page_pool stats in the past.
>>
>> On 02/02/2022 02.12, Joe Damato wrote:
>>> Greetings:
>>>
>>> Sending a v3, as I noted some issues with the procfs code in patch 10
>>> that I submitted in v2 (thanks, kernel test robot), and to fix the
>>> placement of the refill stat increment in patch 8.
>>
>> Could you explain why a single global stats (/proc/net/page_pool_stat)
>> for all page_pool instances for all RX-queues makes sense?
>>
>> I think this argument/explanation belongs in the cover letter.
>
> I included an explanation in the v2 cover letter where those changes
> occurred, but you are right: I should have also included it in the v3
> cover letter.
>
> My thought process was this:
>
> - Stats now have to be enabled by an explicit kernel config option, so
> the user has to know what they are doing
> - Advanced users can steer softirqs to whichever CPUs they wish, and
> can thereby isolate a particular set of RX-queues on a set of CPUs
> - The result is that nothing needs to be exposed to the drivers, and
> no driver modifications are necessary once the single kernel config
> option is enabled and softirq affinity is configured (rough sketch
> below)
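>
> Roughly, the idea is a single set of per-cpu counters that the page
> pool core bumps internally, so no driver changes are needed. A
> simplified sketch with illustrative names, not the exact code from
> the series:
>
>   struct page_pool_global_stats {
>           u64 fast_path_alloc;    /* pages served from the pool cache */
>           u64 slow_path_alloc;    /* pages from the page allocator */
>           u64 refill;             /* cache refill events */
>   };
>
>   /* one instance per CPU, guarded by the kernel config option */
>   DEFINE_PER_CPU(struct page_pool_global_stats, pp_stats);
>
>   /* called from page pool internals in softirq context, so a plain
>    * this_cpu_inc() is enough -- no extra locking needed */
>   static inline void pp_stat_inc_refill(void)
>   {
>           this_cpu_inc(pp_stats.refill);
>   }
>
> With softirq affinity pinning a pool's RX processing to known CPUs,
> per-CPU counters can be mapped back to individual queues.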
>
> I had assumed that by not exposing new APIs / page pool internals, and
> by not requiring drivers to make any changes, I would have a better
> shot at getting my patches accepted.
>
> It sounds like both you and Ilias strongly prefer per-pool-per-cpu
> stats, so I can make that change in the v4.
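>
> Something along these lines, perhaps (a hypothetical sketch of the
> v4 direction, not settled code):
>
>   /* one counter block per pool, allocated per-cpu */
>   struct page_pool_recycle_stats {
>           u64 cached;     /* recycled into the per-cpu cache */
>           u64 ring;       /* recycled into the ptr ring */
>           u64 released;   /* returned to the page allocator */
>   };
>
>   struct page_pool {
>           /* ... existing fields ... */
>           struct page_pool_recycle_stats __percpu *recycle_stats;
>   };
>
> A driver, or an ethtool/debugfs hook, could then sum the per-cpu
> copies for a given pool to report true per-pool numbers.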
>
>> What are you using this for?
>
> I currently graph NIC driver stats from a number of different vendors
> to help better understand the performance of those NICs under my
> company's production workload.
>
> For example, on i40e, I submitted changes to the upstream driver [1]
> and am graphing those stats to better understand the memory reuse
> rate. We have seen some issues around mm allocation contention in
> production workloads with certain NICs and system architectures.
>
> My findings with mlx5 indicate that, under our workload, the driver's
> proprietary page reuse algorithm does not provide much memory reuse
> and puts pressure on the kernel's page allocator. The page pool
> should help remedy this, but without stats I have no clear way to
> measure the effect.
>
> So in short: I'd like to gather and graph stats about the page pool
> API to determine how much impact the page pool API has on page reuse
> for mlx5 in our workload.
>
Hi Joe, Jesper, Ilias, and all,
We plan to remove the in-driver page cache entirely and rely fully on
page_pool for allocations and DMA mapping. This has not happened yet
because page_pool did not support elevated page refcounts (multiple
frags per page) or stats.
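
For reference, the direction is the standard page_pool setup with the
pool owning the DMA mapping, roughly as below (simplified sketch;
names like ring_size, numa_node, and dev are placeholders, and error
handling is omitted):

  struct page_pool_params pp_params = {
          .flags     = PP_FLAG_DMA_MAP,  /* pool maps/unmaps pages */
          .order     = 0,
          .pool_size = ring_size,
          .nid       = numa_node,
          .dev       = dev,              /* struct device * */
          .dma_dir   = DMA_FROM_DEVICE,
  };
  struct page_pool *pool = page_pool_create(&pp_params);

  /* RX refill: pages arrive pre-mapped; the DMA address is
   * available via page_pool_get_dma_addr(page) */
  struct page *page = page_pool_dev_alloc_pages(pool);
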
I'm happy to see that these are getting attention! Thanks for investing
time and effort to push these tasks forward!
>> And do Tariq and Saeed agree with this single global stats approach?
>
> I don't know; I hope they'll chime in.
>
I agree with Jesper and Ilias. Global per-cpu pool stats are very
limited. There is not much we can do with the superposition of several
page pools. IMO, these stats are of real value only when each CPU has
a single pool; otherwise, the summed stats of two or more pools won't
help much with observability or debugging.
Tariq
> As I mentioned above, I don't really mind which approach is preferred
> by you all. I had assumed that something with fewer external APIs
> would be more likely to be accepted, and so I made that change in v2.
>
>>> I only modified the placement of the refill stat, but decided to re-run the
>>> benchmarks used in the v2 [1], and the results are:
>>
>> I appreciate that you are running the benchmarks.
>
> Sure, no worries. As you mentioned in the other thread, perhaps some
> settings need to be adjusted to show more relevant data on faster
> systems.
>
> When I work on the v4, I will take a look at the benchmarks and
> explain any modifications made to them or their options when
> presenting the test results.
>
>>> Test system:
>
> [...]
>
> Thanks,
> Joe
>
> [1]: https://patchwork.ozlabs.org/project/intel-wired-lan/cover/1639769719-81285-1-git-send-email-jdamato@fastly.com/