Message-ID: <CALf+9YcyxRisLbPqn0uy-tRhtUFWNxjyzxSwyONmNe2AV-EV=Q@mail.gmail.com>
Date: Tue, 21 Jan 2025 12:03:20 -0600
From: Vinay Banakar <vny@...gle.com>
To: Byungchul Park <byungchul@...com>, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
willy@...radead.org
Cc: akpm@...ux-foundation.org, mgorman@...e.de, Wei Xu <weixugc@...gle.com>,
Greg Thelen <gthelen@...gle.com>, kernel_team@...ynix.com
Subject: Re: [PATCH] mm: Optimize TLB flushes during page reclaim
On Mon, Jan 20, 2025 at 7:44 PM Byungchul Park <byungchul@...com> wrote:
> The *interesting* IPIs will be reduced by 1/512 at most. Can we see the
> improvement number?
Yes, we reduce IPIs by a factor of 512 by sending one IPI (for TLB
flush) per PMD rather than per page. Since shrink_folio_list()
operates on one PMD at a time, I believe we can safely batch these
operations here.
Here's a concrete example:
When swapping out 20 GiB (5.2M pages):
- Current: Each page triggers an IPI to all cores
  - With 6 cores: 31.4M total interrupts (6 cores × 5.2M pages)
- With patch: One IPI per PMD (512 pages)
  - Only 10.2K IPIs required (5.2M/512)
  - With 6 cores: 61.4K total interrupts
- Results in ~99% reduction in total interrupts
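
The arithmetic behind these figures can be reproduced with a short
sketch. It assumes 4 KiB base pages and 512 pages per PMD (as on
x86-64), and it uses the same simple model as the example above:
total interrupts = IPIs × cores. The function name ipi_estimate is
hypothetical and exists only for this illustration.

```python
GIB = 1024 ** 3
PAGE_SIZE = 4 * 1024      # assumption: 4 KiB base pages
PAGES_PER_PMD = 512       # assumption: 2 MiB PMD / 4 KiB pages


def ipi_estimate(swap_bytes, cores, pages_per_flush):
    """Return (pages, ipis, total_interrupts) for a given flush granularity.

    Hypothetical helper: models one IPI broadcast per `pages_per_flush`
    pages, with each broadcast interrupting every core.
    """
    pages = swap_bytes // PAGE_SIZE
    ipis = -(-pages // pages_per_flush)   # ceiling division
    return pages, ipis, ipis * cores


# Current behaviour: one TLB-flush IPI per page.
pages, cur_ipis, cur_irqs = ipi_estimate(20 * GIB, cores=6, pages_per_flush=1)

# With the patch: one TLB-flush IPI per PMD.
_, new_ipis, new_irqs = ipi_estimate(20 * GIB, cores=6,
                                     pages_per_flush=PAGES_PER_PMD)

reduction = 1 - new_irqs / cur_irqs

print(f"pages:              {pages:,}")       # 5,242,880
print(f"current interrupts: {cur_irqs:,}")    # 31,457,280
print(f"patched IPIs:       {new_ipis:,}")    # 10,240
print(f"patched interrupts: {new_irqs:,}")    # 61,440
print(f"reduction:          {reduction:.1%}") # 99.8%
```

Under this model the reduction is exactly 1 - 1/512, independent of
the core count, since both cases scale linearly with the number of
cores.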
Application performance impact varies by workload, but here's a
representative test case:
- Thread 1: Continuously accesses a 2 GiB private anonymous map (64B
chunks at random offsets)
- Thread 2: Pinned to different core, uses MADV_PAGEOUT on 20 GiB
private anonymous map to swap it out to SSD
- The threads only access their respective maps.
Results:
- Without patch: Thread 1 sees a ~53% throughput reduction during
swap-out. With multiple worker threads (like thread 1), the
cumulative throughput degradation will be much higher.
- With patch: Thread 1 maintains normal throughput
I expect a similar application performance impact when memory reclaim
is triggered by kswapd.