Message-ID: <f5206993-c967-eec2-4679-6054f954c271@gmail.com>
Date: Thu, 3 Feb 2022 21:21:17 +0200
From: Tariq Toukan <ttoukan.linux@...il.com>
To: Joe Damato <jdamato@...tly.com>,
Jesper Dangaard Brouer <jbrouer@...hat.com>
Cc: netdev@...r.kernel.org, kuba@...nel.org,
ilias.apalodimas@...aro.org, davem@...emloft.net, hawk@...nel.org,
Saeed Mahameed <saeed@...nel.org>, brouer@...hat.com
Subject: Re: [net-next v3 00/10] page_pool: Add page_pool stat counters
On 2/2/2022 7:30 PM, Joe Damato wrote:
> On Wed, Feb 2, 2022 at 6:31 AM Jesper Dangaard Brouer
> <jbrouer@...hat.com> wrote:
>>
>>
>> Adding Cc. Tariq and Saeed, as they wanted page_pool stats in the past.
>>
>> On 02/02/2022 02.12, Joe Damato wrote:
>>> Greetings:
>>>
>>> Sending a v3, as I noted some issues with the procfs code in patch 10
>>> that I submitted in v2 (thanks, kernel test robot), and to fix the
>>> placement of the refill stat increment in patch 8.
>>
>> Could you explain why a single global stats (/proc/net/page_pool_stat)
>> for all page_pool instances for all RX-queues makes sense?
>>
>> I think this argument/explanation belongs in the cover letter.
>
> I included an explanation in the v2 cover letter where those changes
> occurred, but you are right: I should have also included it in the v3
> cover letter.
>
> My thought process was this:
>
> - Stats now have to be enabled by an explicit kernel config option, so
> the user has to know what they are doing
> - Advanced users can steer softirqs to whichever CPUs they wish, and
> can thereby isolate a particular set of RX-queues on a set of CPUs
> - The result is that nothing needs to be exposed to the drivers, and
> no driver modifications are necessary once the single kernel config
> option is enabled and softirq affinity is configured (rough sketch
> below)
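>
> Roughly, the idea is a single set of per-cpu counters that the page
> pool core bumps internally, so no driver changes are needed. A
> simplified sketch with illustrative names, not the exact code from
> the series:
>
>   struct page_pool_global_stats {
>           u64 fast_path_alloc;    /* pages served from the pool cache */
>           u64 slow_path_alloc;    /* pages from the page allocator */
>           u64 refill;             /* cache refill events */
>   };
>
>   /* one instance per CPU, guarded by the kernel config option */
>   DEFINE_PER_CPU(struct page_pool_global_stats, pp_stats);
>
>   /* called from page pool internals in softirq context, so a plain
>    * this_cpu_inc() is enough -- no extra locking needed */
>   static inline void pp_stat_inc_refill(void)
>   {
>           this_cpu_inc(pp_stats.refill);
>   }
>
> With softirq affinity pinning a pool's RX processing to known CPUs,
> per-CPU counters can be mapped back to individual queues.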
>
> I had assumed that by not exposing new APIs / page pool internals, and
> by not requiring drivers to make any changes, I would have a better
> shot at getting my patches accepted.
>
> It sounds like both you and Ilias strongly prefer per-pool-per-cpu
> stats, so I can make that change in the v4.
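>
> Something along these lines, perhaps (a hypothetical sketch of the
> v4 direction, not settled code):
>
>   /* one counter block per pool, allocated per-cpu */
>   struct page_pool_recycle_stats {
>           u64 cached;     /* recycled into the per-cpu cache */
>           u64 ring;       /* recycled into the ptr ring */
>           u64 released;   /* returned to the page allocator */
>   };
>
>   struct page_pool {
>           /* ... existing fields ... */
>           struct page_pool_recycle_stats __percpu *recycle_stats;
>   };
>
> A driver, or an ethtool/debugfs hook, could then sum the per-cpu
> copies for a given pool to report true per-pool numbers.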
>
>> What are you using this for?
>
> I currently graph NIC driver stats from a number of different vendors
> to help better understand the performance of those NICs under my
> company's production workload.
>
> For example, on i40e, I submitted changes to the upstream driver [1]
> and am graphing those stats to better understand the memory reuse
> rate. We have seen some issues around mm allocation contention in
> production workloads with certain NICs and system architectures.
>
> My findings with mlx5 indicate that, under our workload, the driver's
> proprietary page reuse algorithm does not provide much memory reuse
> and puts pressure on the kernel's page allocator. The page pool
> should help remedy this, but without stats I have no clear way to
> measure the effect.
>
> So in short: I'd like to gather and graph stats about the page pool
> API to determine how much impact the page pool API has on page reuse
> for mlx5 in our workload.
>
Hi Joe, Jesper, Ilias, and all,
We plan to remove the in-driver page cache entirely and rely fully on
page_pool for allocations and DMA mapping. This has not happened yet
because page_pool did not support elevated page refcounts (multiple
frags per page) or stats.
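
For reference, the direction is the standard page_pool setup with the
pool owning the DMA mapping, roughly as below (simplified sketch;
names like ring_size, numa_node, and dev are placeholders, and error
handling is omitted):

  struct page_pool_params pp_params = {
          .flags     = PP_FLAG_DMA_MAP,  /* pool maps/unmaps pages */
          .order     = 0,
          .pool_size = ring_size,
          .nid       = numa_node,
          .dev       = dev,              /* struct device * */
          .dma_dir   = DMA_FROM_DEVICE,
  };
  struct page_pool *pool = page_pool_create(&pp_params);

  /* RX refill: pages arrive pre-mapped; the DMA address is
   * available via page_pool_get_dma_addr(page) */
  struct page *page = page_pool_dev_alloc_pages(pool);
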
I'm happy to see that these are getting attention! Thanks for investing
time and effort to push these tasks forward!
>> And do Tariq and Saeed agree with this single global stats approach?
>
> I don't know; I hope they'll chime in.
>
I agree with Jesper and Ilias. Global per-cpu pool stats are very
limited. There is not much we can do with the superposition of several
page pools. IMO, these stats are of real value only when each CPU has
a single pool; otherwise, the summed stats of two or more pools won't
help much with observability or debugging.
Tariq
> As I mentioned above, I don't really mind which approach is preferred
> by you all. I had assumed that something with fewer external APIs
> would be more likely to be accepted, and so I made that change in v2.
>
>>> I only modified the placement of the refill stat, but decided to re-run the
>>> benchmarks used in the v2 [1], and the results are:
>>
>> I appreciate that you are running the benchmarks.
>
> Sure, no worries. As you mentioned in the other thread, perhaps some
> settings need to be adjusted to show more relevant data on faster
> systems.
>
> When I work on the v4, I will take a look at the benchmarks and
> explain any modifications made to them or their options when
> presenting the test results.
>
>>> Test system:
>
> [...]
>
> Thanks,
> Joe
>
> [1]: https://patchwork.ozlabs.org/project/intel-wired-lan/cover/1639769719-81285-1-git-send-email-jdamato@fastly.com/