Message-ID: <20230719090518.67g7hascnfcly6hk@techsingularity.net>
Date:   Wed, 19 Jul 2023 10:05:18 +0100
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     "Huang, Ying" <ying.huang@...el.com>
Cc:     Michal Hocko <mhocko@...e.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Arjan Van De Ven <arjan@...ux.intel.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        David Hildenbrand <david@...hat.com>,
        Johannes Weiner <jweiner@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Pavel Tatashin <pasha.tatashin@...een.com>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [RFC 2/2] mm: alloc/free depth based PCP high auto-tuning

On Wed, Jul 19, 2023 at 01:59:00PM +0800, Huang, Ying wrote:
> > The big remaining corner case to watch out for is where the sum
> > of the boosted pcp->high exceeds the low watermark.  If that should ever
> > happen then potentially a premature OOM happens because the watermarks
> > are fine so no reclaim is active but no pages are available. It may even
> > be the case that the sum of pcp->high should not exceed *min* as that
> > corner case means that processes may prematurely enter direct reclaim
> > (not as bad as OOM but still bad).
> 
> Sorry, I don't understand this.  When pages are moved from buddy to PCP,
> zone NR_FREE_PAGES will be decreased in rmqueue_bulk().  That is, pages
> in PCP will be counted as used instead of free.  And, in
> zone_watermark_ok*() and zone_watermark_fast(), zone NR_FREE_PAGES is
> used to check watermark.  So, if my understanding were correct, if the
> number of pages in PCP is larger than low/min watermark, we can still
> trigger reclaim.  Is my understanding correct?
> 

You're right, I didn't check the timing of the accounting. All that
occurred to me was "the timing of when watermarks trigger kswapd or
direct reclaim may change as a result of PCP adaptive resizing". Even
though I got the timing wrong, only the shape of the problem changes.
I suspect that an excessively large pcp->high relative to the watermarks
may mean that reclaim happens prematurely if too many pages are held on
PCP lists as the zone's free page count approaches a watermark. While
disabling the adaptive resizing during reclaim will limit the worst of the
problem, it may still be the case that kswapd is woken early simply because
enough CPUs are holding pages on their PCP lists. Similarly, depending on
the size of pcp->high and the gap between the watermarks, it's possible for
direct reclaim to happen prematurely. I could still be wrong because I
have not thought the problem through fully, examined the code or
considered the implementation. It's simply worth keeping in mind the
impact that elevated pcp->high values have on the timing of watermarks
failing. If it's complex enough, it may be necessary to have a separate
patch dealing with the impact of elevated pcp->high on watermarks.
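
To make the concern concrete, here is a toy Python model of the accounting
described above. It is a sketch under stated assumptions, not kernel code:
ToyZone and simulate() are hypothetical names, the page counts and watermark
values are arbitrary, and the watermark checks are reduced to a plain
comparison against the free page count (no lowmem reserves, no order
handling). The only behaviour taken from the discussion is that pages moved
to a PCP list are subtracted from the zone's free count and that the
watermark checks look at that free count.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyZone:
    """Toy stand-in for a zone. All names and numbers are illustrative."""
    free_pages: int                 # plays the role of zone NR_FREE_PAGES
    wmark_min: int
    wmark_low: int
    pcp_count: List[int] = field(default_factory=list)  # pages held per CPU

    def refill_pcp(self, cpu: int, want: int) -> None:
        """Move pages from the free pool to one CPU's PCP list.

        As with the accounting discussed for rmqueue_bulk(), the pages
        stop being counted as free once they sit on a PCP list.
        """
        moved = max(0, min(want, self.free_pages))
        self.free_pages -= moved
        self.pcp_count[cpu] += moved

    def below_low(self) -> bool:
        """Simplified check: would kswapd be woken?"""
        return self.free_pages < self.wmark_low

    def below_min(self) -> bool:
        """Simplified check: would an allocation enter direct reclaim?"""
        return self.free_pages < self.wmark_min


def simulate(pcp_high: int, ncpus: int = 16, zone_pages: int = 10000,
             wmark_min: int = 1000, wmark_low: int = 1250) -> Tuple[int, bool, bool]:
    """Fill every CPU's PCP list up to pcp_high and report the zone state.

    Returns (free_pages, kswapd_would_wake, direct_reclaim_would_start).
    """
    zone = ToyZone(zone_pages, wmark_min, wmark_low, [0] * ncpus)
    for cpu in range(ncpus):
        zone.refill_pcp(cpu, pcp_high)
    return zone.free_pages, zone.below_low(), zone.below_min()


if __name__ == "__main__":
    for high in (200, 560, 600):
        free, kswapd, direct = simulate(high)
        print(f"pcp_high={high}: free={free} kswapd={kswapd} direct={direct}")
```

With these made-up numbers, a pcp_high of 200 leaves the zone well above the
low watermark, 560 drops the free count below low (so kswapd would be woken
even though 8960 unused pages sit on PCP lists), and 600 drops it below min
as well. How close the real allocator gets to this depends on details the
model deliberately ignores.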

-- 
Mel Gorman
SUSE Labs
