Message-Id: <20170111161647.306e511a2478132ac9a3969e@linux-foundation.org>
Date: Wed, 11 Jan 2017 16:16:47 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: David Rientjes <rientjes@...gle.com>
Cc: Vlastimil Babka <vbabka@...e.cz>,
Mel Gorman <mgorman@...hsingularity.net>,
Michal Hocko <mhocko@...nel.org>,
Jonathan Corbet <corbet@....net>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [patch v2] mm, thp: add new defer+madvise defrag option
On Tue, 10 Jan 2017 16:15:27 -0800 (PST) David Rientjes <rientjes@...gle.com> wrote:
> There is no thp defrag option that currently allows MADV_HUGEPAGE regions
> to do direct compaction and reclaim while all other thp allocations simply
> trigger kswapd and kcompactd in the background and fail immediately.
>
> The "defer" setting simply triggers background reclaim and compaction for
> all regions, regardless of MADV_HUGEPAGE, which makes it unusable for our
> userspace where MADV_HUGEPAGE is being used to indicate the application is
> willing to wait for work to be done to make thp memory available.
>
> The "madvise" setting will do direct compaction and reclaim for these
> MADV_HUGEPAGE regions, but does not trigger kswapd and kcompactd in the
> background for anybody else.
>
> For reasonable usage, there needs to be a mesh between the two options.
> This patch introduces a fifth mode, "defer+madvise", that will do direct
> reclaim and compaction for MADV_HUGEPAGE regions and trigger background
> reclaim and compaction for everybody else so that hugepages may be
> available in the near future.
>
> A proposal to allow direct reclaim and compaction for MADV_HUGEPAGE
> regions as part of the "defer" mode, which would make it a very powerful
> setting and avoid breaking userspace, was offered:
> http://marc.info/?t=148236612700003. This additional mode is a
> compromise.
>
> A second proposal to allow both "defer" and "madvise" to be selected at
> the same time was also offered: http://marc.info/?t=148357345300001.
> This is possible, but there was a concern that it might break existing
> userspaces that parse the output of the defrag mode, so the fifth option
> was introduced instead.
>
> This patch also cleans up the helper function for storing to "enabled"
> and "defrag" since the former supports three modes while the latter
> supports five and triple_flag_store() was getting unnecessarily messy.
>
> --- a/Documentation/vm/transhuge.txt
> +++ b/Documentation/vm/transhuge.txt
> @@ -110,6 +110,7 @@ MADV_HUGEPAGE region.
>
> echo always >/sys/kernel/mm/transparent_hugepage/defrag
> echo defer >/sys/kernel/mm/transparent_hugepage/defrag
> +echo defer+madvise >/sys/kernel/mm/transparent_hugepage/defrag
> echo madvise >/sys/kernel/mm/transparent_hugepage/defrag
> echo never >/sys/kernel/mm/transparent_hugepage/defrag
>
> @@ -120,10 +121,15 @@ that benefit heavily from THP use and are willing to delay the VM start
> to utilise them.
>
> "defer" means that an application will wake kswapd in the background
> -to reclaim pages and wake kcompact to compact memory so that THP is
> +to reclaim pages and wake kcompactd to compact memory so that THP is
> available in the near future. It's the responsibility of khugepaged
> to then install the THP pages later.
>
> +"defer+madvise" will enter direct reclaim and compaction like "always", but
> +only for regions that have used madvise(MADV_HUGEPAGE); all other regions
> +will wake kswapd in the background to reclaim pages and wake kcompactd to
> +compact memory so that THP is available in the near future.
> +
It would be helpful if this text were to tell the reader why they may
choose to use this option: runtime effects, advantages, when-to-use,
when-not-to-use, etc.
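
For reference, the concern in the changelog about userspace that parses the
defrag file can be made concrete with a small sketch. It assumes the usual
sysfs convention of listing every mode on one line with the active one in
square brackets (for example "always defer defer+madvise [madvise] never").
The helper names thp_active_mode and thp_supports_mode are hypothetical and
are not part of the patch; this only illustrates how a script might cope
with a mode list that grows from four entries to five.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the patch. They assume the defrag file
# holds a single line of mode names with the active mode in [brackets].

DEFRAG=/sys/kernel/mm/transparent_hugepage/defrag

# Print the active mode (the bracketed word); prints nothing if none is found.
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Succeed if mode $2 appears anywhere in the mode list $1, active or not.
thp_supports_mode() {
    case " $(printf '%s' "$1" | tr -d '[]') " in
        *" $2 "*) return 0 ;;
    esac
    return 1
}

# Only touch sysfs when the file is actually present and readable.
if [ -r "$DEFRAG" ]; then
    modes=$(cat "$DEFRAG")
    echo "active defrag mode: $(thp_active_mode "$modes")"
    if thp_supports_mode "$modes" defer+madvise; then
        echo "defer+madvise is available on this kernel"
    fi
fi
```

A script written this way matches on the bracketed word rather than on a
fixed position or a fixed number of modes, so an extra entry in the list
does not change its result.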