[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMZfGtX7vmmO5CzcJAdxa6bRsqz6J48ZdXZ19pGmks_o-3g2bg@mail.gmail.com>
Date: Tue, 21 Sep 2021 21:46:06 +0800
From: Muchun Song <songmuchun@...edance.com>
To: Barry Song <21cnbao@...il.com>
Cc: Mike Kravetz <mike.kravetz@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Oscar Salvador <osalvador@...e.de>,
Michal Hocko <mhocko@...e.com>,
Barry Song <song.bao.hua@...ilicon.com>,
David Hildenbrand <david@...hat.com>,
Chen Huang <chenhuang5@...wei.com>,
"Bodeddula, Balasubramaniam" <bodeddub@...zon.com>,
Jonathan Corbet <corbet@....net>,
Matthew Wilcox <willy@...radead.org>,
Xiongchun duan <duanxiongchun@...edance.com>,
fam.zheng@...edance.com, Muchun Song <smuchun@...il.com>,
Qi Zheng <zhengqi.arch@...edance.com>,
linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH RESEND v2 1/4] mm: hugetlb: free the 2nd vmemmap page
associated with each HugeTLB page
On Tue, Sep 21, 2021 at 8:11 PM Barry Song <21cnbao@...il.com> wrote:
>
> On Tue, Sep 21, 2021 at 10:23 PM Muchun Song <songmuchun@...edance.com> wrote:
> >
> > On Sat, Sep 18, 2021 at 6:06 PM Muchun Song <songmuchun@...edance.com> wrote:
> > >
> > > On Sat, Sep 18, 2021 at 12:39 PM Barry Song <21cnbao@...il.com> wrote:
> > > >
> > > > On Sat, Sep 18, 2021 at 12:08 AM Muchun Song <songmuchun@...edance.com> wrote:
> > > > >
> > > > > Currently, we only free 6 vmemmap pages associated with a 2MB HugeTLB
> > > > > page. However, we can remap all tail vmemmap pages to the page frame
> > > > > mapped to with the head vmemmap page. Finally, we can free 7 vmemmap
> > > > > pages for a 2MB HugeTLB page. It is a fine gain (e.g. we can save
> > > > > extra 2GB memory when there is 1TB HugeTLB pages in the system
> > > > > compared with the current implementation).
> > > > >
> > > > > But the head vmemmap page is not freed to the buddy allocator and all
> > > > > tail vmemmap pages are mapped to the head vmemmap page frame. So we
> > > > > can see more than one struct page struct with PG_head (e.g. 8 per 2 MB
> > > > > HugeTLB page) associated with each HugeTLB page. We should adjust
> > > > > compound_head() to make it returns the real head struct page when the
> > > > > parameter is the tail struct page but with PG_head flag.
> > > > >
> > > > > Signed-off-by: Muchun Song <songmuchun@...edance.com>
> > > > > ---
> > > > > Documentation/admin-guide/kernel-parameters.txt | 2 +-
> > > > > include/linux/page-flags.h | 75 +++++++++++++++++++++++--
> > > > > mm/hugetlb_vmemmap.c | 60 +++++++++++---------
> > > > > mm/sparse-vmemmap.c | 21 +++++++
> > > > > 4 files changed, 126 insertions(+), 32 deletions(-)
> > > > >
> > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > > index bdb22006f713..a154a7b3b9a5 100644
> > > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > > @@ -1606,7 +1606,7 @@
> > > > > [KNL] Reguires CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
> > > > > enabled.
> > > > > Allows heavy hugetlb users to free up some more
> > > > > - memory (6 * PAGE_SIZE for each 2MB hugetlb page).
> > > > > + memory (7 * PAGE_SIZE for each 2MB hugetlb page).
> > > > > Format: { on | off (default) }
> > > > >
> > > > > on: enable the feature
> > > > > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > > > > index 8e1d97d8f3bd..7b1a918ebd43 100644
> > > > > --- a/include/linux/page-flags.h
> > > > > +++ b/include/linux/page-flags.h
> > > > > @@ -184,13 +184,64 @@ enum pageflags {
> > > > >
> > > > > #ifndef __GENERATING_BOUNDS_H
> > > > >
> > > > > +#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
> > > > > +extern bool hugetlb_free_vmemmap_enabled;
> > > > > +
> > > > > +/*
> > > > > + * If the feature of freeing some vmemmap pages associated with each HugeTLB
> > > > > + * page is enabled, the head vmemmap page frame is reused and all of the tail
> > > > > + * vmemmap addresses map to the head vmemmap page frame (furture details can
> > > > > + * refer to the figure at the head of the mm/hugetlb_vmemmap.c). In other
> > > > > + * word, there are more than one page struct with PG_head associated with each
> > > > > + * HugeTLB page. We __know__ that there is only one head page struct, the tail
> > > > > + * page structs with PG_head are fake head page structs. We need an approach
> > > > > + * to distinguish between those two different types of page structs so that
> > > > > + * compound_head() can return the real head page struct when the parameter is
> > > > > + * the tail page struct but with PG_head.
> > > > > + *
> > > > > + * The page_head_if_fake() returns the real head page struct iff the @page may
> > > > > + * be fake, otherwise, returns the @page if it cannot be a fake page struct.
> > > > > + */
> > > > > +static __always_inline const struct page *page_head_if_fake(const struct page *page)
> > > > > +{
> > > > > + if (!hugetlb_free_vmemmap_enabled)
> > > > > + return page;
> > > > > +
> > > > > + /*
> > > > > + * Only addresses aligned with PAGE_SIZE of struct page may be fake head
> > > > > + * struct page. The alignment check aims to avoid access the fields (
> > > > > + * e.g. compound_head) of the @page[1]. It can avoid touch a (possibly)
> > > > > + * cold cacheline in some cases.
> > > > > + */
> > > > > + if (IS_ALIGNED((unsigned long)page, PAGE_SIZE) &&
> > > > > + test_bit(PG_head, &page->flags)) {
> > > > > + /*
> > > > > + * We can safely access the field of the @page[1] with PG_head
> > > > > + * because the @page is a compound page composed with at least
> > > > > + * two contiguous pages.
> > > > > + */
> > > > > + unsigned long head = READ_ONCE(page[1].compound_head);
> > > > > +
> > > > > + if (likely(head & 1))
> > > > > + return (const struct page *)(head - 1);
> > > > > + }
> > > > > +
> > > > > + return page;
> > > > > +}
> > > > > +#else
> > > > > +static __always_inline const struct page *page_head_if_fake(const struct page *page)
> > > > > +{
> > > > > + return page;
> > > > > +}
> > > > > +#endif
> > > > > +
> > > > > static inline unsigned long _compound_head(const struct page *page)
> > > > > {
> > > > > unsigned long head = READ_ONCE(page->compound_head);
> > > > >
> > > > > if (unlikely(head & 1))
> > > > > return head - 1;
> > > > > - return (unsigned long)page;
> > > > > + return (unsigned long)page_head_if_fake(page);
> > > >
> > > > hard to read. page_head_if_fake, what is the other side of
> > > > page_head_if_not_fake?
> > >
> > > 1) return itself if the @page is not a fake head page.
> > > 2) return head page if @page is a fake head page.
> > >
> > > So I want to express that page_head_if_fake returns a
> > > head page only and only if the parameter of @page is a
> > > fake head page. Otherwise, it returns itself.
> > >
> > > > I would expect something like
> > > > page_to_page_head()
> > > > or
> > > > get_page_head()
> > > >
> > >
> > > Those names seem to be not appropriate as well, because
> > > its functionality does not make sure it can return a head
> > > page. If the parameter is a head page, it definitely
> > > returns a head page, otherwise, it may return itself which
> > > may be a tail page.
> > >
> > > From this point of view, I still prefer page_head_if_fake.
> > >
> > > > Anyway, I am not quite sure what is the best name. but page_head_if_fake(page)
> > > > sounds odd to me. just like the things have two sides, but if_fake presents
> > > > one side only.
> > >
> > > If others have any ideas, comments are welcome.
> > >
> > > >
> > > > > }
> > > > >
> > > > > #define compound_head(page) ((typeof(page))_compound_head(page))
> > > > > @@ -225,12 +276,14 @@ static inline unsigned long _compound_head(const struct page *page)
> > > > >
> > > > > static __always_inline int PageTail(struct page *page)
> > > > > {
> > > > > - return READ_ONCE(page->compound_head) & 1;
> > > > > + return READ_ONCE(page->compound_head) & 1 ||
> > > > > + page_head_if_fake(page) != page;
> > > >
> > > > i would expect a wrapper like:
> > > > page_is_fake_head()
> > >
> > > Good point. Will do.
> > >
> > > >
> > > > and the above page_to_page_head() can leverage the wrapper.
> > > > here too.
> > > >
> > > > > }
> > > > >
> > > > > static __always_inline int PageCompound(struct page *page)
> > > > > {
> > > > > - return test_bit(PG_head, &page->flags) || PageTail(page);
> > > > > + return test_bit(PG_head, &page->flags) ||
> > > > > + READ_ONCE(page->compound_head) & 1;
> > > >
> > > > hard to read. could it be something like the below?
> > > > return PageHead(page) || PageTail(page);
> > > >
> > > > or do we really need to change this function? even a fake head still has
> > > > the true test_bit(PG_head, &page->flags), though it is not a real head, it
> > > > is still a pagecompound, right?
> > >
> > > Right. PageCompound() can not be changed. It is odd but
> > > efficient because calling page_head_if_fake is eliminated.
> > > So I select performance not readability. I'm not sure if it's
> > > worth it.
> >
> > In order to improve readability, I'll introduce 3 helpers as follows.
> >
> > 1) page_head_or_fake(), which returns true for the head page
> > or fake head page.
> > 2) page_head_is_fake(), which returns true for fake head page.
> > 3) page_tail_not_fake_head(), which returns true for the tail page
> > except the fake head page.
> >
> > In the end, PageHead(), PageTail() and PageCompound() become
> > the following.
> >
> > static __always_inline int PageHead(struct page *page)
> > {
> > return page_head_or_fake(page) && !page_head_is_fake(page);
> > }
> >
> > static __always_inline int PageTail(struct page *page)
> > {
> > return page_tail_not_fake_head(page) || page_head_is_fake(page);
> > }
> >
> > static __always_inline int PageCompound(struct page *page)
> > {
> > return page_head_or_fake(page) || page_tail_not_fake_head(page);
> > }
> >
> > Do those look more readable?
> >
>
> still not good enough. After a second thought, page_head_if_fake seems
> to have the best performance though this function returns an odd value.
> i just made a little bit refine on your code in doc:
Right. page_head_if_fake is the choice for performance.
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 2c0d11e71e26..240c2fca13c7 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -197,8 +197,9 @@ extern bool hugetlb_free_vmemmap_enabled;
> * compound_head() can return the real head page struct when the parameter is
> * the tail page struct but with PG_head.
> *
> - * The page_head_if_fake() returns the real head page struct iff the @page may
> - * be fake, otherwise, returns the @page if it cannot be a fake page struct.
> + * The page_head_if_fake() returns the real head page struct if the @page is
> + * fake page_head, otherwise, returns @page which can either be a true page_
> + * head or tail.
> */
Good annotation.
> static __always_inline const struct page *page_head_if_fake(const
> struct page *page)
> {
> @@ -226,6 +227,12 @@ static __always_inline const struct page
> *page_head_if_fake(const struct page *p
>
> return page;
> }
> +
> +static __always_inline const struct page *page_is_fake_head(const
> struct page *page)
> +{
> + return page_head_if_fake(page) != page;
> +}
> +
> #else
> static __always_inline const struct page *page_head_if_fake(const
> struct page *page)
> {
> @@ -247,7 +254,7 @@ static inline unsigned long _compound_head(const
> struct page *page)
> static __always_inline int PageTail(struct page *page)
> {
> return READ_ONCE(page->compound_head) & 1 ||
> - page_head_if_fake(page) != page;
> + page_is_fake_head(page);
> }
Yeah, this makes PageTail more readable. In your previous thread,
you proposed that why not use PageTail in PageCompound directly
to improve code readability. So I want to introduce 2 more helpers
besides page_is_fake_head().
static __always_inline int page_tail_not_fake_head(struct page *page)
{
return READ_ONCE(page->compound_head) & 1;
}
static __always_inline int page_head_or_fake(struct page *page)
{
return test_bit(PG_head, &page->flags);
}
Then PageTail() and PageCompound() change to the following.
static __always_inline int PageTail(struct page *page)
{
return page_tail_not_fake_head(page) || page_is_fake_head(page);
}
static __always_inline int PageCompound(struct page *page)
{
return page_head_or_fake(page) || page_tail_not_fake_head(page);
}
>From the point of names of helpers, they act as self-annotation.
So I think PageTail and PageCompound become readable
as well. But you said "still not good enough". Is it because of
the names of helpers or introducing more complexity?
Thanks.
Powered by blists - more mailing lists