Message-ID: <20250711120232.13452-1-lianux.mm@gmail.com>
Date: Fri, 11 Jul 2025 20:02:00 +0800
From: wang lian <lianux.mm@...il.com>
To: lorenzo.stoakes@...cle.com
Cc: Liam.Howlett@...cle.com,
	akpm@...ux-foundation.org,
	brauner@...nel.org,
	broonie@...nel.org,
	david@...hat.com,
	gkwang@...x-info.com,
	jannh@...gle.com,
	lianux.mm@...il.com,
	linux-kernel@...r.kernel.org,
	linux-kselftest@...r.kernel.org,
	linux-mm@...ck.org,
	p1ucky0923@...il.com,
	ryncsn@...il.com,
	shuah@...nel.org,
	sj@...nel.org,
	vbabka@...e.cz,
	zijing.zhang@...ton.me,
	ziy@...dia.com
Subject: Re: [PATCH v4] selftests/mm: add process_madvise() tests

Hi Lorenzo Stoakes,

> > Hi Lorenzo Stoakes,
> >
> > >> + *
> > >> + * This test deterministically validates process_madvise() with MADV_COLLAPSE
> > >> + * on a remote process, other advices are difficult to verify reliably.
> > >> + *
> > >> + * The test verifies that a memory region in a child process, initially
> > >> + * backed by small pages, can be collapsed into a Transparent Huge Page by a
> > >> + * request from the parent. The result is verified by parsing the child's
> > >> + * /proc/<pid>/smaps file.
> > >> + */
> >
> > > This is clever and you've put a lot of effort in, but this just seems
> > > absolutely prone to flaking and you're essentially testing something that's
> > > highly automated.
> >
> > > I think you're also going way outside of the realms of testing
> > > process_madvise() and are getting into testing essentially MADV_COLLAPSE
> > > here.
> >
> > > We have to try to keep the test specific to what it is you're testing -
> > > which is process_madvise() itself.
> >
> > > So for me, and I realise you've put a ton of work into this and I'm really
> > > sorry to say it, I think you should drop this specific test.
> >
> > > For me simply testing the remote MADV_DONTNEED is enough.
> >
> > My motivation for this complex test came from the need to verify that
> > the process_madvise operation was actually successful. Without checking
> > the outcome, the test would only validate that the syscall returns the
> > correct number of bytes, not that the advice truly took effect on the
> > target process's memory.
> >
> > For remote calls, process_madvise is intentionally limited to
> > non-destructive advice: MADV_COLD, MADV_PAGEOUT, MADV_WILLNEED,
> > and MADV_COLLAPSE. However, verifying the effects of COLD, PAGEOUT,
> > and WILLNEED is very difficult to do reliably in a selftest. This left
> > MADV_COLLAPSE as what seemed to be the only verifiable option.
> >
> > But, as you correctly pointed out, MADV_COLLAPSE is too dependent on
> > the system's THP state and prone to races with khugepaged. This is the
> > very issue I tried to work around in v4 after the v3 test failures.
> > So I think this test is necessary.
> > As for your other opinions, I completely agree.

> MADV_COLLAPSE is not a reliable test and we're going to end up with flakes. The
> implementation as-is is unreliable, and I'm not sure there's any way to make it
> not-unreliable.

> This is especially true as we change THP behaviour over time. I don't want to
> see failed test reports because of this.

> I think it might be best to simply assert that the operation successfully
> completes without checking whether it actually executes the requested task - it
> would render this functionality completely broken if it were not to actually do
> what was requested.

> >
> >
> >
> > Best regards,
> > Wang Lian

Thank you for the clarification. You've convinced me.

Your suggestion provides a much cleaner path forward. It allows the test
to focus on the process_madvise syscall's interface—asserting the
successful return—without the flakiness of verifying side-effects that
are difficult to observe reliably. This makes the test much more robust.

I will update the patch to implement this clear assertion logic. Thank
you for guiding me to this better solution.
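
As a rough illustration of the return-value-only check discussed above (a
minimal sketch, not the code from the patch), something along these lines
might work. The function name check_remote_madvise_return() and the CHECK_*
result codes are hypothetical, MADV_COLD is picked only as one of the advice
values permitted remotely, the fallback syscall numbers assume an
architecture using the unified syscall table, and the skip conditions
(ENOSYS, EPERM, EINVAL) are an assumption about which environments cannot
run the check at all.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallbacks for older headers; numbers assume the unified syscall table. */
#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434
#endif
#ifndef SYS_process_madvise
#define SYS_process_madvise 440
#endif
#ifndef MADV_COLD
#define MADV_COLD 20
#endif

/* Hypothetical result codes for this sketch (not from the patch). */
enum check_result { CHECK_FAIL = -1, CHECK_PASS = 0, CHECK_SKIP = 1 };

/*
 * Issue process_madvise(MADV_COLD) against a child process and check only
 * the return value: on success the syscall reports the bytes advised.
 */
static enum check_result check_remote_madvise_return(void)
{
	size_t len = 4 * (size_t)sysconf(_SC_PAGESIZE);
	enum check_result res = CHECK_FAIL;
	struct iovec vec;
	ssize_t ret;
	pid_t child;
	int pidfd;
	char *map;

	/* Map and touch the region before fork() so the child inherits it
	 * at the same address and no address hand-off is needed. */
	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return CHECK_SKIP;
	memset(map, 0xaa, len);

	child = fork();
	if (child < 0) {
		munmap(map, len);
		return CHECK_SKIP;
	}
	if (child == 0) {
		/* Child: stay alive until the parent kills us. */
		for (;;)
			pause();
	}

	pidfd = (int)syscall(SYS_pidfd_open, child, 0);
	if (pidfd < 0) {
		res = CHECK_SKIP;	/* e.g. kernel without pidfd support */
		goto out;
	}

	vec.iov_base = map;
	vec.iov_len = len;
	ret = syscall(SYS_process_madvise, pidfd, &vec, 1UL, MADV_COLD, 0U);
	if (ret == (ssize_t)len)
		res = CHECK_PASS;
	else if (ret < 0 &&
		 (errno == ENOSYS || errno == EPERM || errno == EINVAL))
		res = CHECK_SKIP;	/* unsupported or not permitted here */
	close(pidfd);
out:
	kill(child, SIGKILL);
	waitpid(child, NULL, 0);
	munmap(map, len);
	return res;
}
```

A caller would treat CHECK_SKIP as "cannot test here" rather than as a
failure, since remote process_madvise() may require CAP_SYS_NICE. Nothing in
the sketch inspects smaps or THP state, so it does not depend on khugepaged
timing.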


Best regards,
Wang Lian
