Message-ID: <CAGsJ_4x-p+8SzyHQq_EJpbq+hSEu5MCtwpGWvafpk4xfpB1gKg@mail.gmail.com>
Date: Sun, 25 Feb 2024 03:50:48 +0800
From: Barry Song <21cnbao@...il.com>
To: SeongJae Park <sj@...nel.org>
Cc: akpm@...ux-foundation.org, damon@...ts.linux.dev, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, minchan@...nel.org, mhocko@...e.com,
hannes@...xchg.org, Barry Song <v-songbaohua@...o.com>
Subject: Re: [PATCH RFC] mm: madvise: pageout: ignore references rather than
clearing young
On Sun, Feb 25, 2024 at 3:02 AM SeongJae Park <sj@...nel.org> wrote:
>
> On Fri, 23 Feb 2024 17:15:50 +1300 Barry Song <21cnbao@...il.com> wrote:
>
> > From: Barry Song <v-songbaohua@...o.com>
> >
> > While doing MADV_PAGEOUT, the current code will clear PTE young
> > so that vmscan won't read young flags to allow the reclamation
> > of madvised folios to go ahead.
> > It seems we can do it by directly ignoring references, thus we
> > can remove tlb flush in madvise and rmap overhead in vmscan.
> >
> > Regarding the side effect, in the original code, if a parallel
> > thread runs side by side to access the madvised memory with the
> > thread doing madvise, folios will get a chance to be re-activated
> > by vmscan. But with the patch, they will still be reclaimed. But
> > this behaviour doing PAGEOUT and doing access at the same time is
> > quite silly like DoS. So probably, we don't need to care.
>
> I think we might need to take care of the case, since users may use just a
> best-effort estimation like DAMON for the target pages. In such cases, the
> page granularity re-check of the access could be helpful. So I concern if this
> could be a visible behavioral change for some valid use cases.
Hi SeongJae,
If you read the MADV_PAGEOUT code, you will find it is not best-effort.
It clears PTE young and, immediately after the PTEs are cleared, reads
the PTEs and checks whether they are young. If they are not, it reclaims
the folio. So the purpose of clearing PTE young is to make the young
check in folio_references return false. The gap between clearing the
PTEs and re-checking them is quite small, at the microsecond level.
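
To make the comparison concrete, here is a toy userspace model of the
two schemes. It is only a sketch and not kernel code: struct toy_pte,
the toy_* function names and the racing-access callback are all made up
for illustration, and only the young bit is modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PTE: only the accessed ("young") bit is modeled. */
struct toy_pte {
	bool young;
};

/* Callback that simulates another thread touching the range mid-madvise. */
typedef void (*toy_access_fn)(struct toy_pte *ptes, size_t n);

/*
 * Current scheme as described above: clear young on every PTE, then
 * re-check. Reclaim goes ahead only if no PTE became young in between.
 */
bool toy_reclaim_clear_then_check(struct toy_pte *ptes, size_t n,
				  toy_access_fn racing_access)
{
	size_t i;

	for (i = 0; i < n; i++)
		ptes[i].young = false;
	if (racing_access)
		racing_access(ptes, n);	/* the small window */
	for (i = 0; i < n; i++)
		if (ptes[i].young)
			return false;	/* re-activated, not reclaimed */
	return true;
}

/* Proposed scheme: ignore references, so reclaim always goes ahead. */
bool toy_reclaim_ignore_references(struct toy_pte *ptes, size_t n,
				   toy_access_fn racing_access)
{
	if (racing_access)
		racing_access(ptes, n);
	return true;
}

/* Example racing access: touches the first page only. */
void toy_touch_first(struct toy_pte *ptes, size_t n)
{
	if (n > 0)
		ptes[0].young = true;
}
```

With no racing access, both functions return true even if every PTE
started out young. They differ only when the callback fires inside the
window between the clear and the re-check.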
>
> >
> > A microbench as below has shown 6% decrement on the latency of
> > MADV_PAGEOUT,
>
> I assume some of the users may use MADV_PAGEOUT for proactive reclamation of
> the memory. In the use case, I think latency of MADV_PAGEOUT might be not that
> important.
>
> Hence I think the cons of the behavioral change might outweigh the pros of the
> latency improvement, for such best-effort proactive reclamation use case. Hope
> to hear and learn from others' opinions.
I don't see a behavioral change for MADV_PAGEOUT, as only the ping-pong
is removed. The only difference is in that very small time gap, where
somebody could access the cleared PTEs and make them young again.
Considering this time gap is so small, I don't think it is worth caring
about. Thus, I don't see cons for the MADV_PAGEOUT case, while we improve
the efficiency of MADV_PAGEOUT and save power on Android phones.
>
> >
> > #include <sys/mman.h>
> >
> > #define PGSIZE 4096
> > #define SIZE (512*1024*1024)
> >
> > int main(void)
> > {
> >         int i;
> >         volatile long *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
> >                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> >
> >         /* touch one long per page so every page is faulted in */
> >         for (i = 0; i < SIZE/sizeof(long); i += PGSIZE / sizeof(long))
> >                 p[i] = 0x11;
> >
> >         madvise(p, SIZE, MADV_PAGEOUT);
> >         return 0;
> > }
> >
> >         w/o patch                   w/ patch
> > root@10:~# time ./a.out     root@10:~# time ./a.out
> > real    0m49.634s           real    0m46.334s
> > user    0m0.637s            user    0m0.648s
> > sys     0m47.434s           sys     0m44.265s
> >
> > Signed-off-by: Barry Song <v-songbaohua@...o.com>
>
>
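
If anyone wants to time only the madvise() call rather than the whole
process, something like the following might work. It is a sketch and
not what produced the numbers above: time_pageout() is a name made up
here, and the fallback value for MADV_PAGEOUT assumes the Linux uapi
definition in case the libc headers are too old to provide it.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* assumption: Linux uapi value, for old headers */
#endif

#define PGSIZE 4096

/*
 * Map `size` bytes of anonymous memory, touch one long per page, then
 * time madvise(MADV_PAGEOUT) alone. Returns the elapsed seconds, or
 * -1.0 if mmap() fails. *rc receives madvise()'s return value, which
 * may be -1 on kernels without MADV_PAGEOUT.
 */
double time_pageout(size_t size, int *rc)
{
	struct timespec t0, t1;
	size_t i;
	volatile long *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1.0;

	/* fault in every page before measuring */
	for (i = 0; i < size / sizeof(long); i += PGSIZE / sizeof(long))
		p[i] = 0x11;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	*rc = madvise((void *)p, size, MADV_PAGEOUT);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	munmap((void *)p, size);
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_pageout(512UL * 1024 * 1024, &rc) would mirror the size
used in the microbench. The result depends on swap being configured, so
it is worth checking rc before trusting the number.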
Thanks
Barry