[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250711120232.13452-1-lianux.mm@gmail.com>
Date: Fri, 11 Jul 2025 20:02:00 +0800
From: wang lian <lianux.mm@...il.com>
To: lorenzo.stoakes@...cle.com
Cc: Liam.Howlett@...cle.com,
akpm@...ux-foundation.org,
brauner@...nel.org,
broonie@...nel.org,
david@...hat.com,
gkwang@...x-info.com,
jannh@...gle.com,
lianux.mm@...il.com,
linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org,
linux-mm@...ck.org,
p1ucky0923@...il.com,
ryncsn@...il.com,
shuah@...nel.org,
sj@...nel.org,
vbabka@...e.cz,
zijing.zhang@...ton.me,
ziy@...dia.com
Subject: Re: [PATCH v4] selftests/mm: add process_madvise() tests
Hi Lorenzo Stoakes,
> > Hi Lorenzo Stoakes,
> >
> > >> + *
> > >> + * This test deterministically validates process_madvise() with MADV_COLLAPSE
> > >> + * on a remote process, other advices are difficult to verify reliably.
> > >> + *
> > >> + * The test verifies that a memory region in a child process, initially
> > >> + * backed by small pages, can be collapsed into a Transparent Huge Page by a
> > >> + * request from the parent. The result is verified by parsing the child's
> > >> + * /proc/<pid>/smaps file.
> > >> + */
> >
> > > This is clever and you've put a lot of effort in, but this just seems
> > > absolutely prone to flaking and you're essentially testing something that's
> > > highly automated.
> >
> > > I think you're also going way outside of the realms of testing
> > > process_madvise() and are getting into testing essentially MADV_COLLAPSE
> > > here.
> >
> > > > We have to try to keep the test specific to what it is you're testing -
> > > which is process_madvise() itself.
> >
> > > So for me, and I realise you've put a ton of work into this and I'm really
> > > sorry to say it, I think you should drop this specific test.
> >
> > > For me simply testing the remote MADV_DONTNEED is enough.
> >
> > My motivation for this complex test came from the need to verify that
> > the process_madvise operation was actually successful. Without checking
> > the outcome, the test would only validate that the syscall returns the
> > correct number of bytes, not that the advice truly took effect on the
> > target process's memory.
> >
> > For remote calls, process_madvise is intentionally limited to
> > non-destructive advice: MADV_COLD, MADV_PAGEOUT, MADV_WILLNEED,
> > and MADV_COLLAPSE. However, verifying the effects of COLD, PAGEOUT,
> > and WILLNEED is very difficult to do reliably in a selftest. This left
> > MADV_COLLAPSE as what seemed to be the only verifiable option.
> >
> > But, as you correctly pointed out, MADV_COLLAPSE is too dependent on
> > the system's THP state and prone to races with khugepaged. This is the
> > very issue I tried to work around in v4 after the v3 test failures.
> > So I think this test is necessary.
> > As for your other opinions, I completely agree.
> MADV_COLLAPSE is not a reliable test and we're going to end up with flakes. The
> implementation as-is is unreliable, and I"m not sure there's any way to make it
> not-unreliable.
> This is especially true as we change THP behaviour over time. I don't want to
> see failed test reports because of this.
> I think it might be best to simply assert that the operation succesfully
> completes without checking whether it actually executes the requested task - it
> would render this functionality completely broken if it were not to actually do
> what was requested.
> >
> >
> >
> > Best regards,
> > Wang Lian
Thank you for the clarification. You've convinced me.
Your suggestion provides a much cleaner path forward. It allows the test
to focus on the process_madvise syscall's interface—asserting the
successful return—without the flakiness of verifying side-effects that
are difficult to observe reliably. This makes the test much more robust.
I will update the patch to implement this clear assertion logic. Thank
you for guiding me to this better solution.
Best regards,
Wang Lian
Powered by blists - more mailing lists