linux-kernel - Re: [PATCH 25/25] mm, compaction: Do not direct compact remote memory

Message-ID: <84a7b23a-1cb7-b888-4245-6b1e829f472b@suse.cz>
Date:   Fri, 18 Jan 2019 14:51:00 +0100
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Mel Gorman <mgorman@...hsingularity.net>,
        Linux-MM <linux-mm@...ck.org>
Cc:     David Rientjes <rientjes@...gle.com>,
        Andrea Arcangeli <aarcange@...hat.com>, ying.huang@...el.com,
        kirill@...temov.name, Andrew Morton <akpm@...ux-foundation.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 25/25] mm, compaction: Do not direct compact remote memory

On 1/4/19 1:50 PM, Mel Gorman wrote:
> Remote compaction is expensive and possibly counter-productive. Locality
> is expected to often have better performance characteristics than remote
> high-order pages. For small allocations, it's expected that locality is
> generally required or fallbacks are possible. For larger allocations such
> as THP, they are forbidden at the time of writing but if __GFP_THISNODE
> is ever removed, then it would still be preferable to fallback to small
> local base pages over remote THP in the general case. kcompactd is still
> woken via kswapd so compaction happens eventually.
> 
> While this patch potentially has both positive and negative effects,
> it is best to avoid the possibility of remote compaction given the cost
> relative to any potential benefit.
> 
> Signed-off-by: Mel Gorman <mgorman@...hsingularity.net>

Generally agree with the intent, but what about e.g. a high-order (but not
costly) kernel allocation on behalf of a user process on a CPU belonging to a
movable node, where the only non-movable node is node 0? Will it have to keep
reclaiming until a large enough page is formed, or wait for kcompactd?
So maybe do this only for costly orders?
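
As a rough illustration of the costly-order idea, here is a userspace model,
not kernel code. skip_remote_compaction() is a hypothetical helper, and its
two node-id arguments stand in for zone_to_nid(zone) and
zone_to_nid(ac->preferred_zoneref->zone) from the patch. COSTLY_ORDER mirrors
PAGE_ALLOC_COSTLY_ORDER (3 in mainline); an order is "costly" when it is
greater than that value.

```c
#include <stdbool.h>

/* Mirrors the kernel's PAGE_ALLOC_COSTLY_ORDER, which is 3 in mainline. */
#define COSTLY_ORDER 3

/*
 * Hypothetical helper (not a kernel function): decide whether direct
 * compaction should skip a zone.
 *
 * zone_nid      - node of the zone being considered
 * preferred_nid - node of the preferred zone for this allocation
 * order         - allocation order
 */
static bool skip_remote_compaction(int zone_nid, int preferred_nid,
				   unsigned int order)
{
	/* Local zones are never skipped. */
	if (zone_nid == preferred_nid)
		return false;

	/*
	 * Remote zone: skip only for costly orders, so that smaller
	 * high-order requests can still compact remote memory directly.
	 */
	return order > COSTLY_ORDER;
}
```

With this model, an order-2 request with a remote zone is not skipped, while
an order-9 (THP-sized) request is.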

Also, I think compaction_zonelist_suitable() should be updated as well.
Otherwise we might be promising the reclaim-compact loop that we will compact
after enough reclaim, and then not doing so.
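
The mismatch can be sketched with a small userspace model, again not kernel
code. struct model_zone and both functions are hypothetical stand-ins: the
"compactable" flag abstracts whatever the real suitability check computes, and
the skip_remote parameter only exists to compare the two behaviours side by
side.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zone: only what the model needs. */
struct model_zone {
	int nid;           /* node the zone belongs to */
	bool compactable;  /* would compaction be worth trying after reclaim? */
};

/*
 * Models the compaction loop with the patch applied: zones on a node
 * other than the preferred one are skipped.
 */
static bool model_compacts_any_zone(const struct model_zone *zones, size_t n,
				    int preferred_nid)
{
	for (size_t i = 0; i < n; i++) {
		if (zones[i].nid != preferred_nid)
			continue;
		if (zones[i].compactable)
			return true;
	}
	return false;
}

/*
 * Models the suitability check. With skip_remote == false it considers
 * every zone; with skip_remote == true it applies the same filter as the
 * compaction loop above.
 */
static bool model_zonelist_suitable(const struct model_zone *zones, size_t n,
				    int preferred_nid, bool skip_remote)
{
	for (size_t i = 0; i < n; i++) {
		if (skip_remote && zones[i].nid != preferred_nid)
			continue;
		if (zones[i].compactable)
			return true;
	}
	return false;
}
```

With a local zone that is not compactable and a remote zone that is, the
unfiltered suitability check says "yes" while the compaction loop never
compacts anything. Applying the same filter in both places makes them agree.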

> ---
>  mm/compaction.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index ae70be023b21..cc17f0c01811 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2348,6 +2348,16 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
>  			continue;
>  		}
>  
> +		/*
> +		 * Do not compact remote memory. It's expensive and high-order
> +		 * small allocations are expected to prefer or require local
> +		 * memory. Similarly, larger requests such as THP can fallback
> +		 * to base pages in preference to remote huge pages if
> +		 * __GFP_THISNODE is not specified
> +		 */
> +		if (zone_to_nid(zone) != zone_to_nid(ac->preferred_zoneref->zone))
> +			continue;
> +
>  		status = compact_zone_order(zone, order, gfp_mask, prio,
>  				alloc_flags, ac_classzone_idx(ac), capture);
>  		rc = max(status, rc);
>