Message-ID: <20190528105806.GA21060@google.com>
Date: Tue, 28 May 2019 19:58:06 +0900
From: Minchan Kim <minchan@...nel.org>
To: Hillf Danton <hdanton@...a.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>, Michal Hocko <mhocko@...e.com>,
Johannes Weiner <hannes@...xchg.org>,
Tim Murray <timmurray@...gle.com>,
Joel Fernandes <joel@...lfernandes.org>,
Suren Baghdasaryan <surenb@...gle.com>,
Daniel Colascione <dancol@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
Sonny Rao <sonnyrao@...gle.com>,
Brian Geffon <bgeffon@...gle.com>
Subject: Re: [RFC 1/7] mm: introduce MADV_COOL
On Tue, May 28, 2019 at 04:53:01PM +0800, Hillf Danton wrote:
>
> On Mon, 20 May 2019 12:52:48 +0900 Minchan Kim wrote:
> > +static int madvise_cool_pte_range(pmd_t *pmd, unsigned long addr,
> > + unsigned long end, struct mm_walk *walk)
> > +{
> > + pte_t *orig_pte, *pte, ptent;
> > + spinlock_t *ptl;
> > + struct page *page;
> > + struct vm_area_struct *vma = walk->vma;
> > + unsigned long next;
> > +
> > + next = pmd_addr_end(addr, end);
> > + if (pmd_trans_huge(*pmd)) {
> > + spinlock_t *ptl;
>
> Seems not needed with another ptl declared above.
Will remove it.
> > +
> > + ptl = pmd_trans_huge_lock(pmd, vma);
> > + if (!ptl)
> > + return 0;
> > +
> > + if (is_huge_zero_pmd(*pmd))
> > + goto huge_unlock;
> > +
> > + page = pmd_page(*pmd);
> > + if (page_mapcount(page) > 1)
> > + goto huge_unlock;
> > +
> > + if (next - addr != HPAGE_PMD_SIZE) {
> > + int err;
>
> Alternately, we deactivate thp only if the address range from userspace
> is sane enough, in order to avoid complex works we have to do here.
I'm not sure that's a good idea. This is how MADV_FREE already handles a
partial THP range, so I want to stay consistent with it.
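
To make the partial-range case concrete, here is a minimal userspace sketch of the `next - addr != HPAGE_PMD_SIZE` decision quoted above. It is not kernel code: the `sketch_` names are hypothetical, and the 2 MiB huge page size is an assumption (x86-64 with 4 KiB base pages).

```c
/* Assumed geometry: 2 MiB PMD-sized huge pages. */
#define SKETCH_HPAGE_PMD_SIZE (1UL << 21)
#define SKETCH_HPAGE_PMD_MASK (~(SKETCH_HPAGE_PMD_SIZE - 1))

/* Hypothetical labels for the two paths taken in the quoted hunk. */
enum sketch_thp_action {
	SKETCH_DEACTIVATE_WHOLE,	/* range covers the whole huge page */
	SKETCH_SPLIT_FIRST,		/* partial range: split, then walk PTEs */
};

static enum sketch_thp_action sketch_thp_decide(unsigned long addr,
						unsigned long end)
{
	/* Clamp the range to the next PMD boundary, as pmd_addr_end() does. */
	unsigned long boundary = (addr + SKETCH_HPAGE_PMD_SIZE) &
				 SKETCH_HPAGE_PMD_MASK;
	unsigned long next = (boundary - 1 < end - 1) ? boundary : end;

	/* Same test as the patch: anything short of a full huge page splits. */
	return (next - addr != SKETCH_HPAGE_PMD_SIZE) ?
		SKETCH_SPLIT_FIRST : SKETCH_DEACTIVATE_WHOLE;
}
```

With these assumptions, a range of 0x200000..0x400000 maps to the whole-page path, while 0x200000..0x300000 or 0x201000..0x400000 takes the split path.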
> > +
> > + get_page(page);
> > + spin_unlock(ptl);
> > + lock_page(page);
> > + err = split_huge_page(page);
> > + unlock_page(page);
> > + put_page(page);
> > + if (!err)
> > + goto regular_page;
> > + return 0;
> > + }
> > +
> > + pmdp_test_and_clear_young(vma, addr, pmd);
> > + deactivate_page(page);
> > +huge_unlock:
> > + spin_unlock(ptl);
> > + return 0;
> > + }
> > +
> > + if (pmd_trans_unstable(pmd))
> > + return 0;
> > +
> > +regular_page:
>
> Take a look at pending signal?
Do you have a reason to check for a pending signal here? I'd like to
understand your requirement so we can decide on the best place to handle it.
>
> > + orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> > + for (pte = orig_pte; addr < end; pte++, addr += PAGE_SIZE) {
>
> s/end/next/ ?
Why do you think it should be next?
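
For context on the end/next question: the generic page walker appears to clamp the range it passes to the pmd_entry callback to a single PMD, in which case recomputing `next` with pmd_addr_end() inside the callback gives the same value as `end`. A minimal userspace sketch of that arithmetic follows. The `sketch_` names are hypothetical, the helper is modelled on the generic pmd_addr_end() macro, and the 2 MiB PMD size is an assumption.

```c
#include <stdbool.h>

/* Assumed geometry: 2 MiB per PMD entry. */
#define SKETCH_PMD_SIZE (1UL << 21)
#define SKETCH_PMD_MASK (~(SKETCH_PMD_SIZE - 1))

/* Smaller of "next PMD boundary above addr" and "end". */
static unsigned long sketch_pmd_addr_end(unsigned long addr, unsigned long end)
{
	unsigned long boundary = (addr + SKETCH_PMD_SIZE) & SKETCH_PMD_MASK;

	/* The "- 1" keeps the comparison correct if boundary wraps to 0. */
	return (boundary - 1 < end - 1) ? boundary : end;
}

/*
 * Model the two clamping steps: the walker clamps the caller's range to
 * one PMD, then the callback clamps again. The second step is a no-op.
 */
static bool sketch_next_equals_end(unsigned long addr, unsigned long range_end)
{
	unsigned long end = sketch_pmd_addr_end(addr, range_end);  /* walker */
	unsigned long next = sketch_pmd_addr_end(addr, end);       /* callback */

	return next == end;
}
```

Under these assumptions the loop bound is the same whether it is written as `end` or `next`.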
> > + ptent = *pte;
> > +
> > + if (pte_none(ptent))
> > + continue;
> > +
> > + if (!pte_present(ptent))
> > + continue;
> > +
> > + page = vm_normal_page(vma, addr, ptent);
> > + if (!page)
> > + continue;
> > +
> > + if (page_mapcount(page) > 1)
> > + continue;
> > +
> > + ptep_test_and_clear_young(vma, addr, pte);
> > + deactivate_page(page);
> > + }
> > +
> > + pte_unmap_unlock(orig_pte, ptl);
> > + cond_resched();
> > +
> > + return 0;
> > +}
> > +
> > +static long madvise_cool(struct vm_area_struct *vma,
> > + unsigned long start_addr, unsigned long end_addr)
> > +{
> > + struct mm_struct *mm = vma->vm_mm;
> > + struct mmu_gather tlb;
> > +
> > + if (vma->vm_flags & (VM_LOCKED|VM_HUGETLB|VM_PFNMAP))
> > + return -EINVAL;
>
> No service in case of VM_IO?
I don't know whether VM_IO mappings would have regular LRU pages; I just
followed the existing convention for MADV_DONTNEED and MADV_FREE.
Do you have something in mind?
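
As a rough illustration of what the quoted flag check does and does not reject, here is a userspace sketch. The `sketch_` names are hypothetical and the flag values are illustrative stand-ins for the kernel's VM_* bits, not something to rely on. One relevant detail: mappings set up with remap_pfn_range() normally carry both VM_IO and VM_PFNMAP, so they would already be rejected through VM_PFNMAP; only a mapping with VM_IO alone would pass.

```c
#include <stdbool.h>

/* Illustrative flag bits; treat the exact values as assumptions. */
#define SKETCH_VM_PFNMAP  0x00000400UL
#define SKETCH_VM_LOCKED  0x00002000UL
#define SKETCH_VM_IO      0x00004000UL
#define SKETCH_VM_HUGETLB 0x00400000UL

/* Mirrors the mask in the quoted madvise_cool() check. */
static bool sketch_cool_rejects(unsigned long vm_flags)
{
	return (vm_flags & (SKETCH_VM_LOCKED | SKETCH_VM_HUGETLB |
			    SKETCH_VM_PFNMAP)) != 0;
}
```

For example, `sketch_cool_rejects(SKETCH_VM_IO | SKETCH_VM_PFNMAP)` is true, while `sketch_cool_rejects(SKETCH_VM_IO)` is false.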