Message-Id: <20111014135404.b56bed48.kamezawa.hiroyu@jp.fujitsu.com>
Date: Fri, 14 Oct 2011 13:54:04 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: Michel Lespinasse <walken@...gle.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave@...ux.vnet.ibm.com>,
Rik van Riel <riel@...hat.com>,
Balbir Singh <bsingharora@...il.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andrea Arcangeli <aarcange@...hat.com>,
Johannes Weiner <jweiner@...hat.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Hugh Dickins <hughd@...gle.com>,
Michael Wolf <mjwolf@...ibm.com>
Subject: Re: [PATCH 6/9] kstaled: rate limit pages scanned per second.
On Thu, 13 Oct 2011 18:25:06 -0700
Michel Lespinasse <walken@...gle.com> wrote:
> On Wed, Sep 28, 2011 at 1:59 AM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@...fujitsu.com> wrote:
> > On Wed, 28 Sep 2011 01:19:50 -0700
> > Michel Lespinasse <walken@...gle.com> wrote:
> >> It tends to perform worse if we try making it multithreaded. What
> >> happens is that the scanning threads call page_referenced() a lot, and
> >> if they both try scanning pages that belong to the same file that
> >> causes the mapping's i_mmap_mutex lock to bounce. Same things happens
> >> if they try scanning pages that belong to the same anon VMA too.
> >>
> >
> > Hmm. with brief thinking, if you can scan list of page tables,
> > you can set young flags without any locks.
> > For inode pages, you can hook page lookup, I think.
>
> It would be possible to avoid taking rmap locks by instead scanning
> all page tables, and transferring the pte young bits observed there to
> the PageYoung page flag. This is a significant design change, but
> would indeed work.
>
> Just to clarify the idea, how would you go about finding all page
> tables to scan ? The most straightforward approach would be iterate
> over all processes and scan their address spaces, but I don't think we
> can afford to hold tasklist_lock (even for reads) for so long, so we'd
> have to be a bit smarter than that... I can think of a few different
> ways but I'd like to know if you have something specific in mind
> first.

Maybe there are several ideas.

1. How about chasing the "pgd" kmem_cache?
   I'm not sure, but on x86 it seems all pgds are linked to pgd_list.
   Right now it's not an RCU list, but making it one isn't hard.
   Note: IIUC, the struct page for a pgd contains a pointer to its mm_struct.
2. Track dup_mm and do_exec.
   Insert hooks and maintain a list of mm_structs. (It doesn't need to be
   implemented as a list.)
3. Like pgd_list, add some flag to pgd pages. Then you can scan the memmap,
   find 'pgd' pages, and walk into the page table tree.

Hmm?
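
To make idea 2 a bit more concrete, here is a rough userspace toy model, not
kernel code. It assumes some fork/exec hook has already put every address
space on a list, then walks that list and moves the accessed ("young") bit
from each pte to a per-page flag without touching any rmap lock. Every name
here (toy_mm, toy_pte, toy_page, toy_track_mm, toy_scan_all) is hypothetical
and only stands in for the real structures; locking, RCU and multi-level
page tables are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8   /* size of the toy memmap */
#define NPTES  4   /* ptes per toy address space */

/* Toy stand-ins; none of these are real kernel types. */
struct toy_page { bool young; };                 /* models PageYoung */
struct toy_pte  { int pfn; bool present; bool accessed; };
struct toy_mm   { struct toy_pte ptes[NPTES]; struct toy_mm *next; };

static struct toy_page memmap[NPAGES];
static struct toy_mm *mm_list;   /* would be maintained by dup_mm/exec hooks */

/* Hypothetical hook: remember a newly created address space. */
static void toy_track_mm(struct toy_mm *mm)
{
    mm->next = mm_list;
    mm_list = mm;
}

/*
 * Walk every tracked address space, clear the accessed bit in each
 * present pte and record it in the page's young flag.
 * Returns how many pages were newly marked young in this pass.
 */
static int toy_scan_all(void)
{
    int newly_young = 0;

    for (struct toy_mm *mm = mm_list; mm; mm = mm->next) {
        for (size_t i = 0; i < NPTES; i++) {
            struct toy_pte *pte = &mm->ptes[i];

            if (!pte->present || !pte->accessed)
                continue;
            assert(pte->pfn >= 0 && pte->pfn < NPAGES);

            pte->accessed = false;               /* test-and-clear young */
            if (!memmap[pte->pfn].young) {
                memmap[pte->pfn].young = true;   /* transfer to page flag */
                newly_young++;
            }
        }
    }
    return newly_young;
}
```

The point of the sketch is only that the scan is driven from the mm side, so
a page mapped by two address spaces is found through both page tables rather
than through its rmap.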
Thanks,
-Kame