Message-ID: <CAK1f24k+=Sskotbct+yGxpDKNv=qyXPkww5i2kaqfzwaUVO_GQ@mail.gmail.com>
Date: Fri, 19 Jan 2024 10:03:05 +0800
From: Lance Yang <ioworker0@...il.com>
To: Michal Hocko <mhocko@...e.com>
Cc: akpm@...ux-foundation.org, zokeefe@...gle.com, david@...hat.com, 
	songmuchun@...edance.com, shy828301@...il.com, peterx@...hat.com, 
	mknyszek@...gle.com, minchan@...nel.org, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] mm/madvise: add MADV_F_COLLAPSE_LIGHT to process_madvise()

Hey Michal,

Thanks for taking the time to review!

On Thu, Jan 18, 2024 at 9:40 PM Michal Hocko <mhocko@...e.com> wrote:
>
> On Thu 18-01-24 20:03:46, Lance Yang wrote:
> [...]
>
> before we discuss the semantic, let's focus on the usecase.
>
> > Use Cases
> >
> > An immediate user of this new functionality is the Go runtime heap allocator
> > that manages memory in hugepage-sized chunks. In the past, whether it was a
> > newly allocated chunk through mmap() or a reused chunk released by
> > madvise(MADV_DONTNEED), the allocator attempted to eagerly back memory with
> > huge pages using madvise(MADV_HUGEPAGE)[2] and madvise(MADV_COLLAPSE)[3]
> > respectively. However, both approaches resulted in performance issues; for
> > both scenarios, there could be entries into direct reclaim and/or compaction,
> > leading to unpredictable stalls[4]. Now, the allocator can confidently use
> > process_madvise(MADV_F_COLLAPSE_LIGHT) to attempt the allocation of huge pages.
>
> IIUC the primary reason is the cost of the huge page allocation which
> can be really high if the memory is heavily fragmented and it is called
> synchronously from the process directly, correct? Can that be worked

Yes, that's correct.

> around by process_madvise and performing the operation from a different
> context? Are there any other reasons to have a different mode?

In latency-sensitive scenarios, some applications want to use huge pages as
much as possible to improve performance. At the same time, if the huge page
allocation fails, they prefer a quick return without entering direct reclaim
or compaction.
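
To make the calling pattern concrete, here is a minimal userspace sketch of
"try to collapse, carry on with base pages on failure". It is only an
illustration: chunk_alloc() and chunk_try_collapse() are hypothetical helper
names, the 2 MiB huge page size is an assumption (x86-64 PMD size), and it
uses the existing madvise(MADV_COLLAPSE), which can still enter direct
reclaim/compaction. The MADV_F_COLLAPSE_LIGHT flag is only proposed in this
patch, so it is mentioned in a comment rather than used.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* value from the Linux 6.1+ uapi headers */
#endif

/* Assumption: 2 MiB PMD-sized huge pages, as on x86-64. */
#define HPAGE_SIZE (2UL << 20)

/*
 * Hypothetical helper: map one huge-page-aligned, huge-page-sized anonymous
 * chunk. Over-allocates and trims the unaligned head and tail.
 */
static void *chunk_alloc(void)
{
	size_t len = 2 * HPAGE_SIZE;
	char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	uintptr_t start, end, raw_end;

	if (raw == MAP_FAILED)
		return NULL;

	start = ((uintptr_t)raw + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
	end = start + HPAGE_SIZE;
	raw_end = (uintptr_t)raw + len;

	if (start > (uintptr_t)raw)
		munmap(raw, start - (uintptr_t)raw);
	if (raw_end > end)
		munmap((void *)end, raw_end - end);

	return (void *)start;
}

/*
 * Hypothetical helper: best-effort collapse of one chunk.
 * Returns 0 on success or -errno on failure. Failure is not fatal: the
 * chunk simply stays backed by base pages. With the flag proposed in this
 * patch, the caller would instead go through process_madvise() and pass
 * MADV_F_COLLAPSE_LIGHT to avoid direct reclaim and compaction.
 */
static int chunk_try_collapse(void *chunk)
{
	if (madvise(chunk, HPAGE_SIZE, MADV_COLLAPSE) == 0)
		return 0;
	return -errno;
}
```

A caller would touch the chunk, call chunk_try_collapse() once, and ignore a
negative return (e.g. -EINVAL on kernels without MADV_COLLAPSE, or -ENOMEM /
-EAGAIN when no huge page is available) rather than retrying in the hot path.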

>
> I mean I can think of a more relaxed (opportunistic) MADV_COLLAPSE -
> e.g. non blocking one to make sure that the caller doesn't really block
> on resource contention (be it locks or memory availability) because that
> matches our non-blocking interface in other areas but having a LIGHT
> operation sounds really vague and the exact semantic would be
> implementation specific and might change over time. Non-blocking has a
> clear semantic but it is not really clear whether that is what you
> really need/want.

Could you provide me with some suggestions regarding the naming of a
more relaxed (opportunistic) MADV_COLLAPSE?

Thanks again for your review and your suggestion!
Lance

>
> > [1] https://github.com/torvalds/linux/commit/7d8faaf155454f8798ec56404faca29a82689c77
> > [2] https://github.com/golang/go/commit/8fa9e3beee8b0e6baa7333740996181268b60a3a
> > [3] https://github.com/golang/go/commit/9f9bb26880388c5bead158e9eca3be4b3a9bd2af
> > [4] https://github.com/golang/go/issues/63334
> >
> > [v1] https://lore.kernel.org/lkml/20240117050217.43610-1-ioworker0@gmail.com/
> --
> Michal Hocko
> SUSE Labs
