Message-ID: <CAGWkznEsesvbaRqOeqOaYJnD5BYxNOuO57pNt+cM7yOQrdk1Pg@mail.gmail.com>
Date: Sat, 12 Oct 2024 09:49:48 +0800
From: Zhaoyang Huang <huangzhaoyang@...il.com>
To: Yu Zhao <yuzhao@...gle.com>
Cc: "zhaoyang.huang" <zhaoyang.huang@...soc.com>, Andrew Morton <akpm@...ux-foundation.org>, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org, steve.kang@...soc.com
Subject: Re: [PATCH] mm: throttle and inc min_seq when both page types reach MIN_NR_GENS

On Fri, Oct 11, 2024 at 4:02 PM Zhaoyang Huang <huangzhaoyang@...il.com> wrote:
>
> On Fri, Oct 11, 2024 at 12:37 AM Yu Zhao <yuzhao@...gle.com> wrote:
> >
> > On Wed, Oct 9, 2024 at 1:50 AM zhaoyang.huang <zhaoyang.huang@...soc.com> wrote:
> > >
> > > From: Zhaoyang Huang <zhaoyang.huang@...soc.com>
> > >
> > > The test case in [1] leads to a system hang, caused by a local
> > > watchdog thread being starved for over 20s on a 5.5GB RAM
> > > ANDROID15 (v6.6) system. This commit solves the issue by throttling
> > > the reclaimer and increasing min_seq when both page types reach
> > > MIN_NR_GENS, a condition which may otherwise introduce a livelock of
> > > switching type while holding lruvec->lru_lock.
> > >
> > > [1]
> > > Launch the script below 8 times simultaneously; each instance
> > > allocates 1GB of virtual memory and accesses it from user space.
> > > $ costmem -c1024000 -b12800 -o0 &
> > >
> > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@...soc.com>
> > > ---
> > >  mm/vmscan.c | 16 ++++++++++++++--
> > >  1 file changed, 14 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index cfa839284b92..83e450d0ce3c 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -4384,11 +4384,23 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
> > >         int remaining = MAX_LRU_BATCH;
> > >         struct lru_gen_folio *lrugen = &lruvec->lrugen;
> > >         struct mem_cgroup *memcg = lruvec_memcg(lruvec);
> > > +       struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> > >
> > >         VM_WARN_ON_ONCE(!list_empty(list));
> > >
> > > -       if (get_nr_gens(lruvec, type) == MIN_NR_GENS)
> > > -               return 0;
> > > +       if (get_nr_gens(lruvec, type) == MIN_NR_GENS) {
> > > +               /*
> > > +                * throttle for a while and then increase the min_seq since
> > > +                * both page types reach the limit.
> > > +                */
> >
> > Sorry but this isn't going to work because in try_to_inc_min_seq(), there is
> >    `while (min_seq[type] + MIN_NR_GENS <= lrugen->max_seq) {`
> > to prevent reclaimers from evicting hot memory -- they need to do aging first.
> Thanks for the heads-up. My assumption was that a concurrently running
> reclaimer would do the aging, so the throttled reclaimers could
> increase min_seq once scheduled back and move on. Alternatively, could
> we just drop the lock and throttle for a while, to avoid a livelock on
> 'type = !type' while holding the lock?
Please find below the lru_lock contention information [2] that we
collected from a syzkaller test. Please also consider whether the
patch [1], which throttles direct reclaimers based on the number of
isolated folios, is worth discussing.

[1]
https://lore.kernel.org/all/20240716094348.2451312-1-zhaoyang.huang@unisoc.com/

[2]
[  295.163779][T8447@C5] preemptoff_warn: C5 T:<8447>syz.2.17
D:40.429ms F:295.123341s E:6.660 ms
[  295.165000][T8447@C5] preemptoff_warn: C5 enabled preempt at:
[  295.165000][T8447@C5] _raw_spin_unlock_irq+0x2c/0x5c
[  295.165000][T8447@C5] evict_folios+0x2504/0x3050
[  295.165000][T8447@C5] try_to_shrink_lruvec+0x40c/0x594
[  295.165000][T8447@C5] shrink_one+0x174/0x4cc
[  295.165000][T8447@C5] shrink_node+0x1c50/0x2088
[  295.165000][T8447@C5] do_try_to_free_pages+0x560/0xef8
[  295.165000][T8447@C5] try_to_free_pages+0x4e8/0xaf0
[  295.165000][T8447@C5] __alloc_pages_slowpath+0x92c/0x1c78
[  295.165000][T8447@C5] __alloc_pages+0x404/0x48c
[  295.166277][T298@C0] C0 T:<298>logd.writer D:42.389ms F:295.123885s
[  295.166337][T298@C0] C0 enabled IRQ at:
[  295.166337][T298@C0] _raw_spin_unlock_irq+0x20/0x5c
[  295.166337][T298@C0] evict_folios+0x2504/0x3050
[  295.166337][T298@C0] shrink_one+0x174/0x4cc
[  295.166337][T298@C0] shrink_node+0x1c50/0x2088
[  295.166337][T298@C0] do_try_to_free_pages+0x560/0xef8
[  295.166337][T298@C0] try_to_free_pages+0x4e8/0xaf0
[  295.166337][T298@C0] __alloc_pages_slowpath+0x92c/0x1c78
[  295.166337][T298@C0] __alloc_pages+0x404/0x48c
[  295.166337][T298@C0] erofs_allocpage+0x90/0xb0
[  295.167317][T298@C0] preemptoff_warn: C0 T:<298>logd.writer
D:43.424ms F:295.123888s
[  295.168484][T8210@C7] C7 T:<8210>syz-executor D:32.816ms F:295.135666s
[  295.168507][T8210@C7] C7 enabled IRQ at:
[  295.168507][T8210@C7] _raw_spin_unlock_irq+0x20/0x5c
[  295.168507][T8210@C7] evict_folios+0x2504/0x3050
[  295.168507][T8210@C7] shrink_one+0x174/0x4cc
[  295.168507][T8210@C7] shrink_node+0x1c50/0x2088
[  295.168507][T8210@C7] do_try_to_free_pages+0x560/0xef8
[  295.168507][T8210@C7] try_to_free_pages+0x4e8/0xaf0
[  295.168507][T8210@C7] __alloc_pages_slowpath+0x92c/0x1c78
[  295.168507][T8210@C7] __alloc_pages+0x404/0x48c
[  295.168507][T8210@C7] __get_free_pages+0x24/0x3c
[  295.168625][T8210@C7] preemptoff_warn: C7 T:<8210>syz-executor
D:32.956ms F:295.135666s
[  295.168645][T8210@C7] preemptoff_warn: C7 enabled preempt at:
[  295.168645][T8210@C7] _raw_spin_unlock_irq+0x2c/0x5c
[  295.168645][T8210@C7] evict_folios+0x2504/0x3050
[  295.168645][T8210@C7] try_to_shrink_lruvec+0x40c/0x594
[  295.168645][T8210@C7] shrink_one+0x174/0x4cc
[  295.168645][T8210@C7] shrink_node+0x1c50/0x2088
[  295.168645][T8210@C7] do_try_to_free_pages+0x560/0xef8
[  295.168645][T8210@C7] try_to_free_pages+0x4e8/0xaf0
[  295.168645][T8210@C7] __alloc_pages_slowpath+0x92c/0x1c78
[  295.168645][T8210@C7] __alloc_pages+0x404/0x48c
[  295.178291][T8441@C2] C2 T:<8441>syz.3.18 D:42.290ms F:295.135998s
[  295.178356][T8441@C2] C2 enabled IRQ at:
[  295.178356][T8441@C2] _raw_spin_unlock_irq+0x20/0x5c
[  295.178356][T8441@C2] evict_folios+0x2504/0x3050
[  295.178356][T8441@C2] shrink_one+0x174/0x4cc
[  295.178356][T8441@C2] shrink_node+0x1c50/0x2088
[  295.178356][T8441@C2] do_try_to_free_pages+0x560/0xef8
[  295.178356][T8441@C2] try_to_free_pages+0x4e8/0xaf0
[  295.178356][T8441@C2] __alloc_pages_slowpath+0x92c/0x1c78
[  295.178356][T8441@C2] __alloc_pages+0x404/0x48c
[  295.178356][T8441@C2] bpf_ringbuf_alloc+0x22c/0x434
[  295.179135][T8441@C2] preemptoff_warn: C2 T:<8441>syz.3.18
D:43.128ms F:295.136000s

>
> >
> > >
> > > +               if (get_nr_gens(lruvec, !type) == MIN_NR_GENS) {
> > > +                       spin_unlock_irq(&lruvec->lru_lock);
> > > +                       reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED);
> > > +                       spin_lock_irq(&lruvec->lru_lock);
> > > +                       try_to_inc_min_seq(lruvec, get_swappiness(lruvec, sc));
> > > +               } else
> > > +                       return 0;
> > > +       }
> > >
> > >         gen = lru_gen_from_seq(lrugen->min_seq[type]);
> > >
> > > --
> > > 2.25.1
> > >
