linux-kernel - Re: [PATCHv2] mm: skip CMA pages when they are not available

Message-ID: <87mt31sj57.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date:   Fri, 21 Apr 2023 17:02:12 +0800
From:   "Huang, Ying" <ying.huang@...el.com>
To:     Zhaoyang Huang <huangzhaoyang@...il.com>
Cc:     "zhaoyang.huang" <zhaoyang.huang@...soc.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, ke.wang@...soc.com
Subject: Re: [PATCHv2] mm: skip CMA pages when they are not available

Zhaoyang Huang <huangzhaoyang@...il.com> writes:

> On Fri, Apr 21, 2023 at 2:47 PM Huang, Ying <ying.huang@...el.com> wrote:
>>
>> "zhaoyang.huang" <zhaoyang.huang@...soc.com> writes:
>>
>> > From: Zhaoyang Huang <zhaoyang.huang@...soc.com>
>> >
>> > This patch fixes unproductive reclaim of CMA pages by skipping them when they
>> > are not available in the current context. It arises from the OOM issue below, which
>> > is caused by a large proportion of MIGRATE_CMA pages among the free pages. There is
>> > already a commit (168676649) that fixes it by trying CMA pages first instead of as a
>> > fallback in rmqueue. I would like to propose another fix from the reclaim perspective.
>> >
>> > 04166 < 4> [   36.172486] [03-19 10:05:52.172] ActivityManager: page allocation failure: order:0, mode:0xc00(GFP_NOIO), nodemask=(null),cpuset=foreground,mems_allowed=0
>> > 0419C < 4> [   36.189447] [03-19 10:05:52.189] DMA32: 0*4kB 447*8kB (C) 217*16kB (C) 124*32kB (C) 136*64kB (C) 70*128kB (C) 22*256kB (C) 3*512kB (C) 0*1024kB 0*2048kB 0*4096kB = 35848kB
>> > 0419D < 4> [   36.193125] [03-19 10:05:52.193] Normal: 231*4kB (UMEH) 49*8kB (MEH) 14*16kB (H) 13*32kB (H) 8*64kB (H) 2*128kB (H) 0*256kB 1*512kB (H) 0*1024kB 0*2048kB 0*4096kB = 3236kB
>> >       ......
>> > 041EA < 4> [   36.234447] [03-19 10:05:52.234] SLUB: Unable to allocate memory on node -1, gfp=0xa20(GFP_ATOMIC)
>> > 041EB < 4> [   36.234455] [03-19 10:05:52.234] cache: ext4_io_end, object size: 64, buffer size: 64, default order: 0, min order: 0
>> > 041EC < 4> [   36.234459] [03-19 10:05:52.234] node 0: slabs: 53,objs: 3392, free: 0
>>
>> From the above description, you are trying to resolve an issue that has
>> been resolved already.  If so, why do we need your patch?  What issue
>> does it try to resolve in the current upstream kernel?
>
> Please consider the sequence below: __perform_reclaim() returns after
> successfully reclaiming 32 CMA pages, and get_page_from_freelist() then
> fails if MIGRATE_CMA pages are NOT more than 1/2 of the free pages, which
> in turn unreserves the H (highatomic) pageblocks and drains the per-CPU
> pagesets, right? Furthermore, this could also lead to OOM, since
> direct reclaim is the final guard for alloc_pages.
>
> *did_some_progress = __perform_reclaim(gfp_mask, order, ac);
>
> retry:
> page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
>
> if (!page && !drained) {
>         unreserve_highatomic_pageblock(ac, false);
>         drain_all_pages(NULL);
>         drained = true;
>         goto retry;
> }
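
The scenario described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, the zone is reduced to two free-page counters, and the LRU is a plain array of flags saying whether each page sits in a CMA pageblock. It assumes, as in the discussion, that a non-movable request cannot be served from CMA pageblocks, and it mirrors the `cma_cap` condition from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration; not kernel types. */
enum model_migratetype { MODEL_UNMOVABLE, MODEL_MOVABLE };

struct model_zone {
    unsigned long free_cma;   /* free pages inside CMA pageblocks */
    unsigned long free_other; /* free pages in all other pageblocks */
};

/* Assumption: only movable requests may take pages from CMA pageblocks. */
bool model_alloc(struct model_zone *z, enum model_migratetype req)
{
    if (z->free_other > 0) {
        z->free_other--;
        return true;
    }
    if (req == MODEL_MOVABLE && z->free_cma > 0) {
        z->free_cma--;
        return true;
    }
    return false;
}

/*
 * Mirrors the condition in the patch: direct reclaim (not kswapd) on
 * behalf of a non-movable request gains nothing from CMA pages.
 */
bool model_cma_cap(bool cma_enabled, bool is_kswapd,
                   enum model_migratetype req)
{
    if (cma_enabled && !is_kswapd && req != MODEL_MOVABLE)
        return false;
    return true;
}

/*
 * Walk the LRU and free up to nr_to_reclaim pages. When cma_cap is
 * false, pages in CMA pageblocks are skipped instead of reclaimed.
 */
unsigned long model_reclaim(struct model_zone *z, const bool *lru_in_cma,
                            size_t lru_len, unsigned long nr_to_reclaim,
                            bool cma_cap)
{
    unsigned long reclaimed = 0;

    assert(z != NULL && lru_in_cma != NULL);
    for (size_t i = 0; i < lru_len && reclaimed < nr_to_reclaim; i++) {
        if (lru_in_cma[i]) {
            if (!cma_cap)
                continue; /* skip: useless for this request */
            z->free_cma++;
        } else {
            z->free_other++;
        }
        reclaimed++;
    }
    return reclaimed;
}

/*
 * One direct-reclaim attempt for an unmovable request against an LRU
 * whose first 32 pages are CMA pages, followed by 32 non-CMA pages.
 * Returns whether the allocation after reclaim succeeds.
 */
bool model_scenario(bool patched)
{
    struct model_zone z = { 0, 0 };
    bool lru_in_cma[64];
    bool cma_cap = true;

    for (size_t i = 0; i < 64; i++)
        lru_in_cma[i] = (i < 32);
    if (patched)
        cma_cap = model_cma_cap(true, false, MODEL_UNMOVABLE);

    /* Reclaim reports progress (32 pages) in both cases. */
    model_reclaim(&z, lru_in_cma, 64, 32, cma_cap);
    return model_alloc(&z, MODEL_UNMOVABLE);
}
```

In this model, `model_scenario(false)` reclaims 32 pages that all land in CMA pageblocks, so the unmovable allocation still fails even though reclaim made "progress", while `model_scenario(true)` skips those pages and frees 32 usable ones. Whether the real kernel reaches OOM this way depends on details the model leaves out.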

If you think OOM can be triggered, please try to reproduce it.

Best Regards,
Huang, Ying

> return page;
>>
>> At first glance, I don't think your patch is unreasonable.  But
>> you really need to show the value of the patch.
>>
>> Best Regards,
>> Huang, Ying
>>
>> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@...soc.com>
>> > ---
>> > v2: update commit message and fix build error when CONFIG_CMA is not set
>> > ---
>> > ---
>> >  mm/vmscan.c | 15 +++++++++++++--
>> >  1 file changed, 13 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index bd6637f..19fb445 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -2225,10 +2225,16 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>> >       unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
>> >       unsigned long skipped = 0;
>> >       unsigned long scan, total_scan, nr_pages;
>> > +     bool cma_cap = true;
>> > +     struct page *page;
>> >       LIST_HEAD(folios_skipped);
>> >
>> >       total_scan = 0;
>> >       scan = 0;
>> > +     if ((IS_ENABLED(CONFIG_CMA)) && !current_is_kswapd()
>> > +             && (gfp_migratetype(sc->gfp_mask) != MIGRATE_MOVABLE))
>> > +             cma_cap = false;
>> > +
>> >       while (scan < nr_to_scan && !list_empty(src)) {
>> >               struct list_head *move_to = src;
>> >               struct folio *folio;
>> > @@ -2239,12 +2245,17 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
>> >               nr_pages = folio_nr_pages(folio);
>> >               total_scan += nr_pages;
>> >
>> > -             if (folio_zonenum(folio) > sc->reclaim_idx) {
>> > +             page = &folio->page;
>> > +
>> > +             if ((folio_zonenum(folio) > sc->reclaim_idx)
>> > +#ifdef CONFIG_CMA
>> > +                     || (get_pageblock_migratetype(page) == MIGRATE_CMA && !cma_cap)
>> > +#endif
>> > +             ) {
>> >                       nr_skipped[folio_zonenum(folio)] += nr_pages;
>> >                       move_to = &folios_skipped;
>> >                       goto move;
>> >               }
>> > -
>> >               /*
>> >                * Do not count skipped folios because that makes the function
>> >                * return with no isolated folios if the LRU mostly contains