Message-ID: <ZK060sMG0GfC5gUS@dhcp22.suse.cz>
Date: Tue, 11 Jul 2023 13:19:46 +0200
From: Michal Hocko <mhocko@...e.com>
To: Huang Ying <ying.huang@...el.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Arjan Van De Ven <arjan@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...hsingularity.net>,
Vlastimil Babka <vbabka@...e.cz>,
David Hildenbrand <david@...hat.com>,
Johannes Weiner <jweiner@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Pavel Tatashin <pasha.tatashin@...een.com>,
Matthew Wilcox <willy@...radead.org>
Subject: Re: [RFC 2/2] mm: alloc/free depth based PCP high auto-tuning
On Mon 10-07-23 14:53:25, Huang Ying wrote:
> To auto-tune PCP high for each CPU automatically, an
> allocation/freeing depth based PCP high auto-tuning algorithm is
> implemented in this patch.
>
> The basic idea behind the algorithm is to detect the repetitive
> allocation and freeing pattern with short enough period (about 1
> second). The period needs to be short to respond to allocation and
> freeing pattern changes quickly and control the memory wasted by
> unnecessary caching.

1s is an eternity from the allocation POV. Is time-based sampling
really a good choice? I would have expected a natural allocation/freeing
feedback mechanism, i.e. double the batch size when the batch is
consumed and needs to be refilled, and shrink it under memory
pressure (a GFP_NOWAIT allocation fails) or when the surplus grows too
high over the batch (e.g. twice as much). Have you considered something
as simple as that?

Quite honestly, I am not sure a time-based approach is a good choice
because memory consumption tends to be quite bulky (e.g. application
starts or workload transitions based on requests).
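
To make the feedback idea concrete, here is a rough userspace sketch
of what I mean. Everything in it is hypothetical: struct pcp_sketch,
pcp_on_refill() and pcp_on_free() are made-up names, not the kernel's
struct per_cpu_pages or any existing helper, and trimming the cache
back to one batch on shrink is just one possible policy.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU cache state, NOT the kernel's struct per_cpu_pages. */
struct pcp_sketch {
	unsigned int count;     /* pages currently cached */
	unsigned int batch;     /* current refill size */
	unsigned int min_batch; /* lower bound for batch */
	unsigned int max_batch; /* upper bound for batch */
};

static unsigned int clamp_batch(const struct pcp_sketch *p, unsigned int v)
{
	assert(p->min_batch <= p->max_batch);
	if (v < p->min_batch)
		return p->min_batch;
	if (v > p->max_batch)
		return p->max_batch;
	return v;
}

/* The cache ran dry and has to be refilled: double the batch. */
static void pcp_on_refill(struct pcp_sketch *p)
{
	p->batch = clamp_batch(p, p->batch * 2);
	p->count += p->batch;
}

/*
 * A page is freed into the cache. Shrink when the caller reports memory
 * pressure (e.g. a GFP_NOWAIT allocation failed) or when the surplus
 * exceeds twice the batch.
 */
static void pcp_on_free(struct pcp_sketch *p, bool under_pressure)
{
	p->count++;
	if (under_pressure || p->count > 2 * p->batch) {
		p->batch = clamp_batch(p, p->batch / 2);
		/* Assumed policy: hand everything above one batch back. */
		if (p->count > p->batch)
			p->count = p->batch;
	}
}
```

Note that there is no timer anywhere: the batch only changes on refill,
on free, or on a pressure signal, so it follows bulky allocation bursts
immediately instead of waiting for a sampling period to expire.
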
> To detect the repetitive allocation and freeing pattern, the
> alloc/free depth is calculated for each tuning period (1 second) on
> each CPU. To calculate the alloc/free depth, we track the alloc
> count. Which increases for page allocation from PCP and decreases for
> page freeing to PCP. The alloc depth is the maximum alloc count
> difference between the later large value and former small value.
> While, the free depth is the maximum alloc count difference between
> the former large value and the later small value.
>
> Then, the average alloc/free depth in multiple tuning periods is
> calculated, with the old alloc/free depth decay in the average
> gradually.
>
> Finally, the PCP high is set to be the smaller value of average alloc
> depth and average free depth, after clamped between the default and
> the max PCP high. In this way, pure allocation or freeing will not
> enlarge the PCP high because PCP doesn't help.
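
For reference, this is how I read the depth calculation described
above, as a small userspace sketch. The names (struct depth_tracker,
depth_update(), pcp_high_from_depths()) are made up for illustration
and are not taken from the patch, and the decaying average over
multiple periods is left out because its weighting is not spelled out
here.

```c
#include <assert.h>

/* Hypothetical per-period tracker; names are not from the patch. */
struct depth_tracker {
	long count;       /* alloc count: +1 per alloc from PCP, -1 per free */
	long min_seen;    /* smallest alloc count seen so far in the period */
	long max_seen;    /* largest alloc count seen so far in the period */
	long alloc_depth; /* max rise: later large value - former small value */
	long free_depth;  /* max drop: former large value - later small value */
};

/* Start a new tuning period from the current alloc count. */
static void depth_reset(struct depth_tracker *t)
{
	t->min_seen = t->max_seen = t->count;
	t->alloc_depth = t->free_depth = 0;
}

/* delta is +1 for an allocation from PCP, -1 for a free to PCP. */
static void depth_update(struct depth_tracker *t, long delta)
{
	t->count += delta;
	if (t->count < t->min_seen)
		t->min_seen = t->count;
	if (t->count > t->max_seen)
		t->max_seen = t->count;
	if (t->count - t->min_seen > t->alloc_depth)
		t->alloc_depth = t->count - t->min_seen;
	if (t->max_seen - t->count > t->free_depth)
		t->free_depth = t->max_seen - t->count;
}

/* PCP high: smaller of the two average depths, clamped to [def, max]. */
static long pcp_high_from_depths(long avg_alloc, long avg_free,
				 long def_high, long max_high)
{
	long high = avg_alloc < avg_free ? avg_alloc : avg_free;

	assert(def_high <= max_high);
	if (high < def_high)
		high = def_high;
	if (high > max_high)
		high = max_high;
	return high;
}
```

With a pure allocation stream the free depth stays 0, so the minimum is
0 and the result falls back to the default high, which matches the
statement that pure allocation or freeing does not enlarge PCP high.
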
>
> We have tested the algorithm with several workloads on Intel's
> 2-socket server machines.

How does this scheme deal with memory pressure?

--
Michal Hocko
SUSE Labs