Message-ID: <CAMgjq7AnaNr354zzu-Z-SB6xZtD1+a2zUwFtZ_Qg7pMj0m7y7A@mail.gmail.com>
Date: Mon, 2 Sep 2024 04:39:24 +0800
From: Kairui Song <ryncsn@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Jingxiang Zeng <linuszeng@...cent.com>, Jingxiang Zeng <jingxiangzeng.cas@...il.com>, 
	linux-mm@...ck.org, Yu Zhao <yuzhao@...gle.com>, Wei Xu <weixugc@...gle.com>, 
	"T . J . Mercier" <tjmercier@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/vmscan: wake up flushers conditionally to avoid cgroup OOM

On Sat, Aug 31, 2024 at 8:38 AM Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Thu, 29 Aug 2024 18:25:43 +0800 Jingxiang Zeng <jingxiangzeng.cas@...il.com> wrote:
>
> > From: Zeng Jingxiang <linuszeng@...cent.com>
> >
> > Commit 14aa8b2d5c2e ("mm/mglru: don't sync disk for each aging cycle")
> > removed the opportunity to wake up flushers during the MGLRU page
> > reclamation process, which can lead to an increased likelihood of
> > triggering OOM when encountering many dirty pages during reclamation
> > on MGLRU.
> >
> > This leads to premature OOM if there are too many dirty pages in a cgroup:
> > Killed
> >
> > ...
> >
> > The flusher wake up was removed to decrease SSD wearing, but if we are
> > seeing all dirty folios at the tail of an LRU, not waking up the flusher
> > could lead to thrashing easily. So wake it up when a mem cgroup is
> > about to OOM due to dirty caches.
>
> Thanks, I'll queue this for testing and review.  Could people please
> consider whether we should backport this into -stable kernels.
>

Hi Andrew, thanks for picking this up.

> > MGLRU still suffers OOM issue on latest mm tree, so the test is done
> > with another fix merged [1].
> >
> > Link: https://lore.kernel.org/linux-mm/CAOUHufYi9h0kz5uW3LHHS3ZrVwEq-kKp8S6N-MZUmErNAXoXmw@mail.gmail.com/ [1]
>
> This one is already queued for -stable.

I didn't see this in -unstable or -stable, though. Is there another
repo or branch I missed? Jingxiang is referring to this fix from Yu:

diff --git a/mm/vmscan.c b/mm/vmscan.c
index cfa839284b92..778bf5b7ef97 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4320,7 +4320,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
        }

        /* ineligible */
-       if (zone > sc->reclaim_idx || skip_cma(folio, sc)) {
+       if (!folio_test_lru(folio) || zone > sc->reclaim_idx || skip_cma(folio, sc)) {
                gen = folio_inc_gen(lruvec, folio, false);
                list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
                return true;
