[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGsJ_4w0SzyA50zNcGAgUc36GOEVr3L6gcZntw4ejBogC9b6+Q@mail.gmail.com>
Date: Sun, 11 Aug 2024 18:54:33 +1200
From: Barry Song <21cnbao@...il.com>
To: David Hildenbrand <david@...hat.com>
Cc: akpm@...ux-foundation.org, baolin.wang@...ux.alibaba.com,
chrisl@...nel.org, hanchuanhua@...o.com, ioworker0@...il.com,
kaleshsingh@...gle.com, kasong@...cent.com, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, ryan.roberts@....com, v-songbaohua@...o.com,
ziy@...dia.com
Subject: Re: [PATCH RFC 1/2] mm: collect the number of anon large folios
On Sun, Aug 11, 2024 at 5:20 PM Barry Song <21cnbao@...il.com> wrote:
>
> On Fri, Aug 9, 2024 at 7:22 PM David Hildenbrand <david@...hat.com> wrote:
> >
> > On 09.08.24 09:04, Barry Song wrote:
> > >>>> I would appreciate if we leave the rmap out here.
> > >>>>
> > >>>> Can't we handle that when actually freeing the folio? folio_test_anon()
> > >>>> is sticky until freed.
> > >>>
> > >>> To be clearer: we increment the counter when we set a folio anon, which
> > >>> should indeed only happen in folio_add_new_anon_rmap(). We'll have to
> > >>> ignore hugetlb here where we do it in hugetlb_add_new_anon_rmap().
> > >>>
> > >>> Then, when we free an anon folio we decrement the counter. (hugetlb
> > >>> should clear the anon flag when an anon folio gets freed back to its
> > >>> allocator -- likely that is already done).
> > >>>
> > >>
> > >> Sorry that I am talking to myself: I'm wondering if we also have to
> > >> adjust the counter when splitting a large folio to multiple
> > >> smaller-but-still-large folios.
> > >
> > > Hi David,
> > >
> > > The conceptual code is shown below. Does this make more
> > > sense to you? we have a line "mod_mthp_stat(new_order,
> > > MTHP_STAT_NR_ANON, 1 << (order - new_order));"
> > >
> > > @@ -3270,8 +3272,9 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> > > struct deferred_split *ds_queue = get_deferred_split_queue(folio);
> > > /* reset xarray order to new order after split */
> > > XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
> > > - struct anon_vma *anon_vma = NULL;
> > > + bool is_anon = folio_test_anon(folio);
> > > struct address_space *mapping = NULL;
> > > + struct anon_vma *anon_vma = NULL;
> > > int order = folio_order(folio);
> > > int extra_pins, ret;
> > > pgoff_t end;
> > > @@ -3283,7 +3286,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> > > if (new_order >= folio_order(folio))
> > > return -EINVAL;
> > >
> > > - if (folio_test_anon(folio)) {
> > > + if (is_anon) {
> > > /* order-1 is not supported for anonymous THP. */
> > > if (new_order == 1) {
> > > VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> > > @@ -3323,7 +3326,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> > > if (folio_test_writeback(folio))
> > > return -EBUSY;
> > >
> > > - if (folio_test_anon(folio)) {
> > > + if (is_anon) {
> > > /*
> > > * The caller does not necessarily hold an mmap_lock that would
> > > * prevent the anon_vma disappearing so we first we take a
> > > @@ -3437,6 +3440,10 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> > > }
> > > }
> > >
> > > + if (is_anon) {
> > > + mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
> > > + mod_mthp_stat(new_order, MTHP_STAT_NR_ANON, 1 << (order - new_order));
> > > + }
> > > __split_huge_page(page, list, end, new_order);
> > > ret = 0;
> > > } else {
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 408ef3d25cf5..c869d0601614 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -1039,6 +1039,7 @@ __always_inline bool free_pages_prepare(struct page *page,
> > > bool skip_kasan_poison = should_skip_kasan_poison(page);
> > > bool init = want_init_on_free();
> > > bool compound = PageCompound(page);
> > > + bool anon = PageAnon(page);
> > >
> > > VM_BUG_ON_PAGE(PageTail(page), page);
> > >
> > > @@ -1130,6 +1131,9 @@ __always_inline bool free_pages_prepare(struct page *page,
> > >
> > > debug_pagealloc_unmap_pages(page, 1 << order);
> > >
> > > + if (anon && compound)
> > > + mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
> > > +
> > > return true;
> >
> > I'd have placed it here, when we are already passed the "PageMappingFlags" check and
> > shouldn't have any added overhead for most !anon pages IIRC (mostly only anon/ksm pages should
> > run into that path).
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 408ef3d25cf5..a11b9dd62964 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1079,8 +1079,11 @@ __always_inline bool free_pages_prepare(struct page *page,
> > (page + i)->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
> > }
> > }
> > - if (PageMappingFlags(page))
> > + if (PageMappingFlags(page)) {
> > + if (PageAnon(page) && compound)
> > + mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
> > page->mapping = NULL;
> > + }
> > if (is_check_pages_enabled()) {
> > if (free_page_is_bad(page))
> > bad++;
> >
>
> This looks better, but we're not concerned about the bad pages, correct?
> For bad pages, we're reducing the count by 1, but they aren't actually
> freed. We don't need to worry about this since it's already considered
> a bug, right?
>
> > Conceptually LGTM. We account an anon folio as long as it's anon,
> > even when still GUP-pinned after unmapping it or when temporarily
> > unmapping+remapping it during migration.
>
> right. but migration might be a problem? as the dst folio doesn't
> call folio_add_new_anon_rmap(), dst->mapping is copied from
> src. So I suspect we need the below (otherwise, src has been put
> and got -1, but dst won't get +1),
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 7e1267042a56..11ef11e59036 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1102,6 +1102,8 @@ static void migrate_folio_done(struct folio *src,
> mod_node_page_state(folio_pgdat(src), NR_ISOLATED_ANON +
> folio_is_file_lru(src),
> -folio_nr_pages(src));
>
> + mod_mthp_stat(folio_order(src), MTHP_STAT_NR_ANON, 1);
> +
> if (reason != MR_MEMORY_FAILURE)
> /* We release the page in page_handle_poison. */
> folio_put(src);
>
sorry. wrong place as migrate_folio_done() doesn't necessarily mean
migration is done. it could be "Folio was freed from under us"
if (folio_ref_count(src) == 1) {
/* Folio was freed from under us. So we are done. */
folio_clear_active(src);
folio_clear_unevictable(src);
/* free_pages_prepare() will clear PG_isolated. */
list_del(&src->lru);
migrate_folio_done(src, reason);
return MIGRATEPAGE_SUCCESS;
}
the correct place should be:
@@ -1329,6 +1326,10 @@ static int migrate_folio_move(free_folio_t
put_new_folio, unsigned long private,
if (anon_vma)
put_anon_vma(anon_vma);
folio_unlock(src);
+
+ if (folio_test_anon(src))
+ mod_mthp_stat(folio_order(src), MTHP_STAT_NR_ANON, 1);
+
migrate_folio_done(src, reason);
return rc;
Without this modification in migration code, my tests fail, anon_num can
become negative.
>
> >
> > --
> > Cheers,
> >
> > David / dhildenb
> >
Thanks
Barry
Powered by blists - more mailing lists