Message-ID: <Yduuyrk/AZG717Hs@google.com>
Date:   Sun, 9 Jan 2022 20:58:02 -0700
From:   Yu Zhao <yuzhao@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
        Jesse Barnes <jsbarnes@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Michael Larabel <Michael@...haellarabel.com>,
        Rik van Riel <riel@...riel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Will Deacon <will@...nel.org>,
        Ying Huang <ying.huang@...el.com>,
        linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        page-reclaim@...gle.com, x86@...nel.org,
        Konstantin Kharlamov <Hi-Angel@...dex.ru>
Subject: Re: [PATCH v6 6/9] mm: multigenerational lru: aging

On Fri, Jan 07, 2022 at 10:00:31AM +0100, Michal Hocko wrote:
> On Fri 07-01-22 09:55:09, Michal Hocko wrote:
> [...]
> > > In this case, lru_gen_mm_walk is small (160 bytes); it's per direct
> > > reclaimer; and direct reclaimers rarely come here, i.e., only when
> > > kswapd can't keep up in terms of the aging, which is similar to the
> > > condition where the inactive list is empty for the active/inactive
> > > lru.
> > 
> > Well, this is not a strong argument to be honest. Kswapd being stuck
> > and the majority of the reclaim being done in the direct reclaim
> > context is a situation I have seen many many times.
> 
> Also do not forget that memcg reclaim is effectively only direct
> reclaim. Not that the memcg reclaim indicates a global memory shortage
> but it can add up and race with the global reclaim as well.

I don't dispute any of the above, and I probably don't like this code
any more than you do.

But let's not forget that the purposes of PF_MEMALLOC, besides
preventing recursive reclaim, include letting reclaim dip into
reserves so that it can free more memory. So I think it's acceptable
if the following conditions are met:
1. The allocation size is small.
2. The number of allocations is bounded.
3. Its failure doesn't stall reclaim.
And it'd be nice if
4. The allocation happens rarely, e.g., slow path only.

The code in question meets all of them.

1. This allocation is 160 bytes.
2. It's bounded by the number of page table walkers which, in the
   worst case, is the same as the number of mm_structs.
3. Most importantly, its failure doesn't stall the aging. The aging
   will fall back to the rmap-based function lru_gen_look_around().
   But this function only gathers the accessed bit from at most 64
   PTEs, meaning it's less efficient (it retains ~80% of the
   performance gains).
4. This allocation is rare, i.e., only when the aging is required,
   which is similar to the low inactive case for the active/inactive
   lru.
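
The fallback in item 3 can be illustrated with a minimal userspace
sketch. This is not the patch's code: age_range(), struct walk_state,
struct aging_result and the simulate_alloc_failure flag are
hypothetical names made up for illustration, and calloc() stands in
for the kernel allocation. Only the 160-byte size and the 64-PTE
window come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define WALK_STATE_SIZE  160 /* allocation size quoted in the thread */
#define LOOK_AROUND_PTES  64 /* fallback window quoted in the thread */

/* Hypothetical stand-in for the per-walker state. */
struct walk_state {
	unsigned char scratch[WALK_STATE_SIZE];
};

/* Hypothetical result type: which path ran and how much it covered. */
struct aging_result {
	bool used_page_table_walk;
	size_t ptes_scanned;
};

/*
 * Try the page table walk first; if the small allocation fails, do not
 * stall: fall back to a bounded look-around instead.
 */
static struct aging_result age_range(size_t nr_ptes,
				     bool simulate_alloc_failure)
{
	struct aging_result res = { false, 0 };
	struct walk_state *walk;

	assert(nr_ptes > 0);

	/* The flag only exists so the failure path can be exercised. */
	walk = simulate_alloc_failure ? NULL : calloc(1, sizeof(*walk));

	if (walk) {
		/* Preferred path: the walker covers the whole range. */
		res.used_page_table_walk = true;
		res.ptes_scanned = nr_ptes;
		free(walk);
	} else {
		/* Fallback path: capped at LOOK_AROUND_PTES entries. */
		res.ptes_scanned = nr_ptes < LOOK_AROUND_PTES ?
				   nr_ptes : LOOK_AROUND_PTES;
	}
	return res;
}
```

The point of the shape is that the allocation failure is an ordinary
branch rather than an error: the caller always gets a result, just
with less coverage per call on the fallback path.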

The bottom line is I can try various optimizations, e.g., preallocate
a few buffers for a limited number of page walkers and, once this
limit has been reached, fall back to the rmap-based function. But I
have yet to see evidence that calls for additional complexity.
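
For reference, the preallocation idea mentioned above could look
roughly like the following userspace sketch. Everything here is an
assumption made for illustration: struct walk_pool, walk_pool_get(),
walk_pool_put() and the limit of 4 buffers are hypothetical, and the
sketch is single-threaded, whereas real reclaim code would need
locking or atomics around the in_use flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PREALLOC_WALKS 4 /* hypothetical limit on concurrent walkers */

/* Hypothetical stand-in for the 160-byte per-walker state. */
struct walk_state {
	unsigned char scratch[160];
};

/* Fixed set of buffers reserved up front, so no allocation at use. */
struct walk_pool {
	struct walk_state slots[NR_PREALLOC_WALKS];
	bool in_use[NR_PREALLOC_WALKS];
};

/*
 * Hand out a free buffer, or NULL once the limit has been reached.
 * A NULL return means the caller should take the rmap-based path.
 */
static struct walk_state *walk_pool_get(struct walk_pool *pool)
{
	for (size_t i = 0; i < NR_PREALLOC_WALKS; i++) {
		if (!pool->in_use[i]) {
			pool->in_use[i] = true;
			return &pool->slots[i];
		}
	}
	return NULL;
}

/* Return a buffer previously obtained from walk_pool_get(). */
static void walk_pool_put(struct walk_pool *pool, struct walk_state *walk)
{
	ptrdiff_t i = walk - pool->slots;

	assert(i >= 0 && i < NR_PREALLOC_WALKS && pool->in_use[i]);
	pool->in_use[i] = false;
}
```

A zero-initialized struct walk_pool starts with every slot free. The
trade-off is the one the paragraph above names: the pool removes the
allocation from the reclaim path at the cost of extra bookkeeping and
a hard cap on the number of concurrent page table walkers.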
