Message-ID: <aBKZILBdDfx-Gwi3@google.com>
Date: Wed, 30 Apr 2025 14:41:52 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: James Houghton <jthoughton@...gle.com>
Cc: axelrasmussen@...gle.com, cgroups@...r.kernel.org, dmatlack@...gle.com,
hannes@...xchg.org, kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
mkoutny@...e.com, mlevitsk@...hat.com, tj@...nel.org, yosry.ahmed@...ux.dev,
yuzhao@...gle.com
Subject: Re: [PATCH v3 5/5] KVM: selftests: access_tracking_perf_test: Use
MGLRU for access tracking

On Tue, Apr 29, 2025, James Houghton wrote:
> On Mon, Apr 28, 2025 at 9:19 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > Using MGLRU on my home box fails. It's full cgroup v2, and has both
> > CONFIG_IDLE_PAGE_TRACKING=y and MGLRU enabled.
> >
> > ==== Test Assertion Failure ====
> > access_tracking_perf_test.c:244: false
> > pid=114670 tid=114670 errno=17 - File exists
> > 1 0x00000000004032a9: find_generation at access_tracking_perf_test.c:244
> > 2 0x00000000004032da: lru_gen_mark_memory_idle at access_tracking_perf_test.c:272
> > 3 0x00000000004034e4: mark_memory_idle at access_tracking_perf_test.c:391
> > 4 (inlined by) run_test at access_tracking_perf_test.c:431
> > 5 0x0000000000403d84: for_each_guest_mode at guest_modes.c:96
> > 6 0x0000000000402c61: run_test_for_each_guest_mode at access_tracking_perf_test.c:492
> > 7 0x000000000041d8e2: cg_run at cgroup_util.c:382
> > 8 0x00000000004027fa: main at access_tracking_perf_test.c:572
> > 9 0x00007fa1cb629d8f: ?? ??:0
> > 10 0x00007fa1cb629e3f: ?? ??:0
> > 11 0x00000000004029d4: _start at ??:?
> > Could not find a generation with 90% of guest memory (235929 pages).
> >
> > Interestingly, if I force the test to use /sys/kernel/mm/page_idle/bitmap, it
> > passes.
> >
> > Please try to reproduce the failure (assuming you haven't already tested that
> > exact combination of cgroups v2, MGLRU=y, and CONFIG_IDLE_PAGE_TRACKING=y). I
> > don't have bandwidth to dig any further at this time.
>
> Sorry... please see the bottom of this message for a diff that should fix this.
> It fixes these bugs:
>
> 1. Tracking generation numbers without hardware Accessed bit management.
> (This is the addition of lru_gen_last_gen.)
> 1.5 It does an initial aging pass so that pages always move to newer
> generations in (or before) the subsequent aging passes. This probably
> isn't needed given the change I made for (1).
> 2. Fixes the expected number of pages for guest page sizes > PAGE_SIZE.
> (This is the move of test_pages. test_pages has also been renamed to avoid
> shadowing.)
> 3. Fixes an off-by-one error when looking for the generation with the most
> pages. Previously it failed to check the youngest generation, which I think
> is the bug you ran into. (This is the change to lru_gen_util.c.)

Ya, this was the bug I initially ran into. I also encountered more failures after
applying just that fix. But with the full diff applied, it's passing, so good
to go for the next version from my end.
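
For illustration only, here is a minimal C sketch of the kind of off-by-one
described in (3). The struct, the function names, and the page counts are
hypothetical; none of them are taken from lru_gen_util.c or from the diff. The
sketch only shows how an exclusive upper bound skips the youngest generation
when searching for the generation with the most pages.

```c
#include <stddef.h>

/* Hypothetical per-generation page counts, not the selftest's real types. */
struct gen_stats {
	int min_gen;		/* oldest generation number */
	int max_gen;		/* youngest generation number */
	long nr_pages[8];	/* nr_pages[i] is for generation min_gen + i */
};

/* Buggy variant: "<" stops before max_gen, so the youngest is never checked. */
static int busiest_gen_buggy(const struct gen_stats *s)
{
	long best_pages = -1;
	int best = -1;

	for (int gen = s->min_gen; gen < s->max_gen; gen++) {
		long pages = s->nr_pages[gen - s->min_gen];

		if (pages > best_pages) {
			best_pages = pages;
			best = gen;
		}
	}
	return best;
}

/* Fixed variant: "<=" includes max_gen, i.e. the youngest generation. */
static int busiest_gen_fixed(const struct gen_stats *s)
{
	long best_pages = -1;
	int best = -1;

	for (int gen = s->min_gen; gen <= s->max_gen; gen++) {
		long pages = s->nr_pages[gen - s->min_gen];

		if (pages > best_pages) {
			best_pages = pages;
			best = gen;
		}
	}
	return best;
}
```

If most of the pages sit in the youngest generation, the buggy variant reports
an older generation with far fewer pages, which is consistent with a "Could not
find a generation with 90% of guest memory" style failure.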