lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231023092647.dxu45uq4lncw5asy@techsingularity.net>
Date:   Mon, 23 Oct 2023 10:26:47 +0100
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     "Huang, Ying" <ying.huang@...el.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Arjan Van De Ven <arjan@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        David Hildenbrand <david@...hat.com>,
        Johannes Weiner <jweiner@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Michal Hocko <mhocko@...e.com>,
        Pavel Tatashin <pasha.tatashin@...een.com>,
        Matthew Wilcox <willy@...radead.org>,
        Christoph Lameter <cl@...ux.com>
Subject: Re: [PATCH -V3 8/9] mm, pcp: decrease PCP high if free pages < high
 watermark

On Fri, Oct 20, 2023 at 11:30:53AM +0800, Huang, Ying wrote:
> Mel Gorman <mgorman@...hsingularity.net> writes:
> 
> > On Mon, Oct 16, 2023 at 01:30:01PM +0800, Huang Ying wrote:
> >> One target of PCP is to minimize pages in PCP if the system free pages
> >> is too few.  To reach that target, when page reclaiming is active for
> >> the zone (ZONE_RECLAIM_ACTIVE), we will stop increasing PCP high in
> >> allocating path, decrease PCP high and free some pages in freeing
> >> path.  But this may be too late because the background page reclaiming
> >> may introduce latency for some workloads.  So, in this patch, during
> >> page allocation we will detect whether the number of free pages of the
> >> zone is below high watermark.  If so, we will stop increasing PCP high
> >> in allocating path, decrease PCP high and free some pages in freeing
> >> path.  With this, we can reduce the possibility of the premature
> >> background page reclaiming caused by too large PCP.
> >> 
> >> The high watermark checking is done in allocating path to reduce the
> >> overhead in hotter freeing path.
> >> 
> >> Signed-off-by: "Huang, Ying" <ying.huang@...el.com>
> >> Cc: Andrew Morton <akpm@...ux-foundation.org>
> >> Cc: Mel Gorman <mgorman@...hsingularity.net>
> >> Cc: Vlastimil Babka <vbabka@...e.cz>
> >> Cc: David Hildenbrand <david@...hat.com>
> >> Cc: Johannes Weiner <jweiner@...hat.com>
> >> Cc: Dave Hansen <dave.hansen@...ux.intel.com>
> >> Cc: Michal Hocko <mhocko@...e.com>
> >> Cc: Pavel Tatashin <pasha.tatashin@...een.com>
> >> Cc: Matthew Wilcox <willy@...radead.org>
> >> Cc: Christoph Lameter <cl@...ux.com>
> >> ---
> >>  include/linux/mmzone.h |  1 +
> >>  mm/page_alloc.c        | 33 +++++++++++++++++++++++++++++++--
> >>  2 files changed, 32 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >> index ec3f7daedcc7..c88770381aaf 100644
> >> --- a/include/linux/mmzone.h
> >> +++ b/include/linux/mmzone.h
> >> @@ -1018,6 +1018,7 @@ enum zone_flags {
> >>  					 * Cleared when kswapd is woken.
> >>  					 */
> >>  	ZONE_RECLAIM_ACTIVE,		/* kswapd may be scanning the zone. */
> >> +	ZONE_BELOW_HIGH,		/* zone is below high watermark. */
> >>  };
> >>  
> >>  static inline unsigned long zone_managed_pages(struct zone *zone)
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 8382ad2cdfd4..253fc7d0498e 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -2407,7 +2407,13 @@ static int nr_pcp_high(struct per_cpu_pages *pcp, struct zone *zone,
> >>  		return min(batch << 2, pcp->high);
> >>  	}
> >>  
> >> -	if (pcp->count >= high && high_min != high_max) {
> >> +	if (high_min == high_max)
> >> +		return high;
> >> +
> >> +	if (test_bit(ZONE_BELOW_HIGH, &zone->flags)) {
> >> +		pcp->high = max(high - (batch << pcp->free_factor), high_min);
> >> +		high = max(pcp->count, high_min);
> >> +	} else if (pcp->count >= high) {
> >>  		int need_high = (batch << pcp->free_factor) + batch;
> >>  
> >>  		/* pcp->high should be large enough to hold batch freed pages */
> >> @@ -2457,6 +2463,10 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
> >>  	if (pcp->count >= high) {
> >>  		free_pcppages_bulk(zone, nr_pcp_free(pcp, batch, high, free_high),
> >>  				   pcp, pindex);
> >> +		if (test_bit(ZONE_BELOW_HIGH, &zone->flags) &&
> >> +		    zone_watermark_ok(zone, 0, high_wmark_pages(zone),
> >> +				      ZONE_MOVABLE, 0))
> >> +			clear_bit(ZONE_BELOW_HIGH, &zone->flags);
> >>  	}
> >>  }
> >>  
> >
> > This is a relatively fast path and freeing pages should not need to check
> > watermarks.
> 
> Another stuff that mitigate the overhead is that the watermarks checking
> only occurs when we free pages from PCP to buddy.  That is, in most
> cases, every 63 page freeing.
> 

True

> > While the overhead is mitigated because it applies only when
> > the watermark is below high, that is also potentially an unbounded condition
> > if a workload is sized precisely enough. Why not clear this bit when kswapd
> > is going to sleep after reclaiming enough pages in a zone?
> 
> IIUC, if the number of free pages is kept larger than the low watermark,
> then kswapd will have no opportunity to be waken up even if the number
> of free pages was ever smaller than the high watermark.
> 

Also true and I did think of that later. I guess it's ok, the chances
are that the series overall offsets any micro-costs like this so I'm
happy. If, for some reason, this overhead is noticable (doubtful), then
it can be revisted.

Thanks.

-- 
Mel Gorman
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ