Message-Id: <20090427174944.86dbb94c.kamezawa.hiroyu@jp.fujitsu.com>
Date: Mon, 27 Apr 2009 17:49:44 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: balbir@...ux.vnet.ibm.com
Cc: Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"hugh@...itas.com" <hugh@...itas.com>
Subject: Re: [RFC][PATCH] fix swap entries is not reclaimed in proper way
for memg v3.
On Mon, 27 Apr 2009 14:13:47 +0530
Balbir Singh <balbir@...ux.vnet.ibm.com> wrote:
> > I like to. But there is no space to record it as stale. And "race" makes
> > that difficult even if we have enough space. If you read the whole thread,
> > you know there are many patterns of race.
>
> There have been several iterations of this discussion, summarizing it
> would be nice, let me find the thread.
>
First, it's obvious that there is no free space in the swap entry array or the
swap_cgroup array. (And this can be trouble even if MEM_RES_CONTROLLER_SWAP_EXT
is not used.)
I tried to record "stale" information in page_cgroup with a flag, but the
following sequence can happen, so I can't do it.
==
CPU0 (zap_pte)              CPU1 (read swap)
                            swap_duplicate()
free_swapentry()
                            add_to_swap_cache().
==
In this case, we can't know at zap_pte() whether the swap_entry is stale or not.
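
The interleaving above can be sketched as a small toy model. This is only an illustration under stated assumptions: the SwapEntry class, its refcount field, and run_race() are hypothetical and only borrow the function names used in this mail; it is not the kernel implementation.

```python
# Toy model only: names mirror the mail, not the real kernel code.
class SwapEntry:
    def __init__(self):
        self.refcount = 1           # assumed: one reference held by the pte being zapped
        self.in_swap_cache = False  # no page in the swap cache yet


def swap_duplicate(entry):
    """CPU1: the swap-in path takes an extra reference before reading."""
    entry.refcount += 1


def free_swapentry(entry):
    """CPU0: zap_pte drops the pte's reference. Returns True if the entry was freed."""
    entry.refcount -= 1
    return entry.refcount == 0


def add_to_swap_cache(entry):
    """CPU1: the page becomes visible in the swap cache."""
    entry.in_swap_cache = True


def run_race():
    entry = SwapEntry()
    swap_duplicate(entry)                  # CPU1
    freed = free_swapentry(entry)          # CPU0: not freed, CPU1 still holds a reference
    seen_by_zap_pte = entry.in_swap_cache  # CPU0 sees no swap cache page at this point
    add_to_swap_cache(entry)               # CPU1: the page shows up only afterwards
    return freed, seen_by_zap_pte, entry.in_swap_cache
```

In this model, zap_pte finishes while the entry is still referenced and before any swap cache page exists, so it has nothing on which to set a "stale" flag.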
> >
> > > 2. Can't we reclaim stale entries during memcg LRU reclaim? Why write
> > > a GC for it?
> > >
> > Because they are not on memcg LRU. we can't reclaim it by memcg LRU.
> > (See the first mail from Nishimura of this thread. It explains well.)
> >
>
> Hmm.. I don't find it, let me do a more exhaustive search on the web.
> If the entry is stale and not on memcg LRU, it is still accounted to
> the memcg?
Yes. It is accounted to memory.memsw.usage_in_bytes.
>
> > One easy case is here.
> >
> > - CPU0 call zap_pte()->free_swap_and_cache()
> > - CPU1 tries to swap-in it.
> > In this case, free_swap_and_cache() doesn't free swp_entry and swp_entry
> > is read into the memory. But it will never be added memcg's LRU until
> > it's mapped.
>
> That is strange.. not even added to the LRU as a cached page?
>
It is added to the "global" LRU but not to the memcg's LRU, because the "USED" bit is not set.
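
A minimal sketch of that state, assuming a simplified model: the Memcg class, swap_out(), swapin_unmapped() and demo() are hypothetical names made up for illustration, and the "used" argument only stands in for the "USED" bit.

```python
# Toy model only: the names below are illustrative, not kernel APIs.
class Memcg:
    def __init__(self):
        self.lru = []         # per-memcg LRU
        self.memsw_usage = 0  # stands in for the memsw usage counter


def swap_out(memcg):
    """A page of this memcg goes to swap; the entry stays charged to memsw."""
    memcg.memsw_usage += 1


def swapin_unmapped(page, memcg, global_lru, used=False):
    """Swap-in (e.g. by readahead) of a page that is not mapped yet."""
    global_lru.append(page)     # always linked to the global LRU
    if used:                    # only pages with the "USED" bit set ...
        memcg.lru.append(page)  # ... are linked to the memcg's LRU


def demo():
    memcg = Memcg()
    global_lru = []
    swap_out(memcg)
    swapin_unmapped("page0", memcg, global_lru)
    return global_lru, memcg.lru, memcg.memsw_usage
```

Here the page sits on the global LRU only, while the charge is still counted, so scanning the memcg's LRU never finds it.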
> > (What we have to consider here is swapin-readahead. It can swap-in memory
> > even if it's not accessed. Then, this race window is larger than expected.)
> >
> > We can't use memcg's LRU then...what we can do is.
> >
> > - scanning global LRU all
> > or
> > - use some trick to reclaim them in lazy way.
> >
>
> Thanks for being patient, some of these questions have been discussed
> before I suppose. Let me dig out the thread.
>
Sorry for the lack of explanation. I'll add more text to the v4 patch.
Thanks,
-kame