[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGsJ_4wD-NiquhnhK_6TGEF4reTGO9pVNGyYBnYZ1inVFc40WQ@mail.gmail.com>
Date: Sun, 17 Mar 2024 18:47:42 +1300
From: Barry Song <21cnbao@...il.com>
To: 刘海龙(LaoLiu) <liuhailong@...o.com>
Cc: "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "nathan@...nel.org" <nathan@...nel.org>,
"ndesaulniers@...gle.com" <ndesaulniers@...gle.com>, "trix@...hat.com" <trix@...hat.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"llvm@...ts.linux.dev" <llvm@...ts.linux.dev>, "surenb@...gle.com" <surenb@...gle.com>,
"zhaoyang.huang@...soc.com" <zhaoyang.huang@...soc.com>,
"quic_charante@...cinc.com" <quic_charante@...cinc.com>, "yuzhao@...gle.com" <yuzhao@...gle.com>
Subject: Re: [PATCH v2] Revert "mm: skip CMA pages when they are not available"
On Fri, Mar 15, 2024 at 11:44 PM 刘海龙(LaoLiu) <liuhailong@...o.com> wrote:
>
> On 2024/3/15 16:46, Barry Song wrote:
> > On Fri, Mar 15, 2024 at 9:18 PM <liuhailong@...o.com> wrote:
> >>
> >> From: "Hailong.Liu" <liuhailong@...o.com>
> >>
> >> This reverts
> >> commit b7108d66318a ("Multi-gen LRU: skip CMA pages when they are not eligible")
> >> commit 5da226dbfce3 ("mm: skip CMA pages when they are not available")
> >
> > Reverting these changes seems inappropriate since the original patches
> > did improve most cases.
> > While doing direct reclamation without MOVABLE flags, scanning cma
> > folio doesn't help at all.
> >
>
> Absolutely, but without this patch it give chance to reclaim cma pages to freelist,
> other tasks without this flag would work normally without holding too much time
> on lru lock and lru size decrease。 it also trigger memory psi event to allow
> admin do something to release memory.
>
> >>
> >> skip_cma may cause system not responding. if cma pages is large in lru_list
> >> and system is in lowmemory, many tasks would direct reclaim and waste
> >> cpu time to isolate_lru_pages and return.
> >>
> >> Test this patch on android-5.15 8G device
> >> reproducer:
> >> - cma_declare_contiguous 3G pages
> >> - set /proc/sys/vm/swappiness 0 to enable direct_reclaim reclaim file
> >> only.
> >> - run a memleak process in userspace
> >
> >
> > I assume that scanning cma folio provides additional opportunities to reclaim
> > anonymous memory, even though it is ineffective for allocating NON-MOVABLE
> > folios. Consequently, we alleviate pressure for future allocations of anonymous
> > folios. Moreover, setting swappiness=0 is a rare scenario. Instead of entirely
> > removing the mechanism, could we detect such corner cases and handle them
> > differently.
> >
> > Otherwise, I don't see any chance of this being acknowledged.
>
> It happens in internal aging test. open camera and record 8K video which use
> more than 2Gib dmabuf pages. without this patch the devices would kill camera process,
> but the system hang with it.
>
> In fact, in my opinion the issue of this patch increase the scanning lru time
> and reclaim nothing. Do you have a better solution ?
rather than aggressively reverting the original patch which might help lots of
cases, could we just detect the corner cases and fix them , for example, by
something as the belows,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 96147c4536e4..2f75ad9870c1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -147,6 +147,9 @@ struct scan_control {
/* Always discard instead of demoting to lower tier memory */
unsigned int no_demotion:1;
+ /* no skipping cma */
+ unsigned int no_skip_cma:1;
+
/* Allocation order */
s8 order;
@@ -1595,7 +1598,7 @@ static __always_inline void
update_lru_sizes(struct lruvec *lruvec,
*/
static bool skip_cma(struct folio *folio, struct scan_control *sc)
{
- return !current_is_kswapd() &&
+ return !current_is_kswapd() && !sc->no_skip_cma &&
gfp_migratetype(sc->gfp_mask) != MIGRATE_MOVABLE &&
folio_migratetype(folio) == MIGRATE_CMA;
}
@@ -1636,7 +1639,7 @@ static unsigned long isolate_lru_folios(unsigned
long nr_to_scan,
unsigned long nr_taken = 0;
unsigned long nr_zone_taken[MAX_NR_ZONES] = { 0 };
unsigned long nr_skipped[MAX_NR_ZONES] = { 0, };
- unsigned long skipped = 0;
+ unsigned long skipped = 0, skipped_cma = 0;
unsigned long scan, total_scan, nr_pages;
LIST_HEAD(folios_skipped);
@@ -1645,6 +1648,7 @@ static unsigned long isolate_lru_folios(unsigned
long nr_to_scan,
while (scan < nr_to_scan && !list_empty(src)) {
struct list_head *move_to = src;
struct folio *folio;
+ bool cma = false;
folio = lru_to_folio(src);
prefetchw_prev_lru_folio(folio, src, flags);
@@ -1652,10 +1656,11 @@ static unsigned long
isolate_lru_folios(unsigned long nr_to_scan,
nr_pages = folio_nr_pages(folio);
total_scan += nr_pages;
- if (folio_zonenum(folio) > sc->reclaim_idx ||
- skip_cma(folio, sc)) {
+ cma = skip_cma(folio, sc);
+ if (folio_zonenum(folio) > sc->reclaim_idx || cma) {
nr_skipped[folio_zonenum(folio)] += nr_pages;
move_to = &folios_skipped;
+ skipped_cma += nr_pages;
goto move;
}
@@ -1716,6 +1721,16 @@ static unsigned long
isolate_lru_folios(unsigned long nr_to_scan,
*nr_scanned = total_scan;
trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan,
total_scan, skipped, nr_taken, lru);
+
+ /*
+ * If the majority of folios are skipped due to contiguous
memory allocator (CMA),
+ * the least recently used (LRU) list might become overloaded
with CMA folios.
+ * This could result in numerous idle loops for non-movable
allocations. On the
+ * other hand, reclaiming CMA folios can alleviate the strain
on other processes
+ * that require anonymous pages.
+ */
+ sc->no_skip_cma = skipped_cma > total_scan / 2 ? 1 : 0;
+
update_lru_sizes(lruvec, lru, nr_zone_taken);
return nr_taken;
}
This is merely a proof of concept (PoC), and I haven't even built it
yet. Additionally, LRU_GEN remains
unchanged. It will also need similar changes.
>
> Brs,
> Hailong.
Thanks
Barry
Powered by blists - more mailing lists