linux-kernel - Re: [PATCH 2/4] Add replace_page(), change the mapping of pte from one page into another

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20081113151129.35c17962.kamezawa.hiroyu@jp.fujitsu.com>
Date:	Thu, 13 Nov 2008 15:11:29 +0900
From:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To:	Izik Eidus <izik@...ranet.com>
Cc:	Avi Kivity <avi@...hat.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Izik Eidus <ieidus@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	kvm@...r.kernel.org, chrisw@...hat.com, izike@...ranet.com
Subject: Re: [PATCH 2/4] Add replace_page(), change the mapping of pte from
 one page into another

Thank you for answers.

On Wed, 12 Nov 2008 13:11:12 +0200
Izik Eidus <izik@...ranet.com> wrote:

> Avi Kivity wrote:
> > KAMEZAWA Hiroyuki wrote:
> >> Can I make a question ? (I'm working for memory cgroup.)
> >>
> >> Now, we do charge to anonymous page when
> >>   - charge(+1) when it's mapped firstly (mapcount 0->1)
> >>   - uncharge(-1) it's fully unmapped (mapcount 1->0) vir 
> >> page_remove_rmap().
> >>
> >> My quesion is
> >>  - PageKSM pages are not necessary to be tracked by memory cgroup ?
> When we reaplacing page using page_replace() we have:
> oldpage - > anonymous page that is going to be replaced by newpage
> newpage -> kernel allocated page (KsmPage)
> so about oldpage we are calling page_remove_rmap() that will notify cgroup
> and about newpage it wont be count inside cgroup beacuse it is file rmap 
> page
> (we are calling to page_add_file_rmap), so right now PageKSM wont ever 
> be tracked by cgroup.
> 
If not in radix-tree, it's not tracked.
(But we don't want to track non-LRU pages which are not freeable.)


> >>  - Can we know that "the page is just replaced and we don't necessary 
> >> to do
> >>    charge/uncharge".
> 
> The caller of page_replace does know it, the only problem is that 
> page_remove_rmap()
> automaticly change the cgroup for anonymous pages,
> if we want it not to change the cgroup, we can:
> increase the cgroup count before page_remove (but in that case what 
> happen if we reach to the limit???)
> give parameter to page_remove_rmap() that we dont want the cgroup to be 
> changed.

Hmm, current mem cgroup works via page_cgroup struct to track pages.

   page <-> page_cgroup has one-to-one relation ship.

So, "exchanging page" itself causes trouble. But I may be able to provide
necessary hooks to you as I did in page migraiton.

> 
> >>  - annonymous page from KSM is worth to be tracked by memory cgroup ?
> >>    (IOW, it's on LRU and can be swapped-out ?)
> 
> KSM have no anonymous pages (it share anonymous pages into KsmPAGE -> 
> kernel allocated page without mapping)
> so it isnt in LRU and it cannt be swapped, only when KsmPAGEs will be 
> break by do_wp_page() the duplication will be able to swap.
> 
Ok, thank you for confirmation.

> >>   
> >
> > My feeling is that shared pages should be accounted as if they were 
> > not shared; that is, a share page should be accounted for each process 
> > that shares it.  Perhaps sharing within a cgroup should be counted as 
> > 1 page for all the ptes pointing to it.
> >
> >

If KSM pages are on radix-tree, it will be accounted automatically.
Now, we have "Unevictable" LRU and mlocked() pages are smartly isolated into its
own LRU. So, just doing

 - inode's radix-tree
 - make all pages mlocked.
 - provide special page fault handler for your purpose

is simple one. But ok, whatever implementation you'll do, I have to check it
and consider whether it should be tracked or not. Then, add codes to memcg to
track it or ignore it or comments on your patches ;)

It's helpful to add me to CC: when you post this set again.

Thanks,
-Kame






--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/