Message-ID: <Pine.LNX.4.64.0906092013580.31606@sister.anvils>
Date: Tue, 9 Jun 2009 20:27:29 +0100 (BST)
From: Hugh Dickins <hugh.dickins@...cali.co.uk>
To: Izik Eidus <ieidus@...hat.com>
cc: aarcange@...hat.com, akpm@...ux-foundation.org,
nickpiggin@...oo.com.au, chrisw@...hat.com, riel@...hat.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 0/4] RFC - ksm api change into madvise
On Tue, 9 Jun 2009, Hugh Dickins wrote:
> On Tue, 9 Jun 2009, Izik Eidus wrote:
> > How does this look?
>
> One improvement to make now, though: you've elsewhere avoided
> the pgd,pud,pmd,pte descent in ksm.c (using get_pte instead), and
> page_check_address() is not static to rmap.c (filemap_xip wanted it),
> so please continue to use that. It's not exported, right, but I think
> Chris was already decisive that we should abandon modular KSM, yes?

I think you can simplify it further, can't you? Isn't the get_pte()
preamble in try_to_merge_one_page() just unnecessary overhead now? See
untested code below. Or even move the trylock/unlock of the page into
write_protect_page() if you prefer. Later on we'll uninline rmap.c's
vma_address() so you can use it instead of your addr_in_vma() copy.

Hugh
static inline int write_protect_page(struct page *page,
				     struct vm_area_struct *vma,
				     pte_t *orig_pte)
{
	struct mm_struct *mm = vma->vm_mm;
	unsigned long addr;
	pte_t *ptep;
	spinlock_t *ptl;
	int swapped;
	int ret = 1;

	addr = addr_in_vma(vma, page);
	if (addr == -EFAULT)
		goto out;

	ptep = page_check_address(page, mm, addr, &ptl, 0);
	if (!ptep)
		goto out;

	if (pte_write(*ptep)) {
		pte_t entry;

		swapped = PageSwapCache(page);
		flush_cache_page(vma, addr, page_to_pfn(page));
		/*
		 * This is tricky: get_user_pages_fast() takes no lock when
		 * it runs, so checking the page count against the mapcount
		 * is racy, and O_DIRECT can start right after the check.
		 * So we clear the pte and flush the TLB before the check;
		 * this ensures that no O_DIRECT can start after the check
		 * or in the middle of it.
		 */
		entry = ptep_clear_flush(vma, addr, ptep);
		/*
		 * Check that no O_DIRECT or similar I/O is in progress on the
		 * page
		 */
		if ((page_mapcount(page) + 2 + swapped) != page_count(page)) {
			set_pte_at_notify(mm, addr, ptep, entry);
			goto out_unlock;
		}
		entry = pte_wrprotect(entry);
		set_pte_at_notify(mm, addr, ptep, entry);
	}
	/*
	 * Hand the current pte back even when it was already read-only:
	 * the caller passes orig_pte on to replace_page().
	 */
	*orig_pte = *ptep;
	ret = 0;

out_unlock:
	pte_unmap_unlock(ptep, ptl);
out:
	return ret;
}
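
The reference-count test in write_protect_page() can be modelled in
userspace. What follows is only a sketch, not kernel code: struct
page_model and has_unexpected_refs() are hypothetical stand-ins for
page_mapcount(), page_count() and PageSwapCache(). Reading the constant 2
as "one reference held by the scanner that looked the page up, plus one
from get_page() in try_to_merge_one_page()" is an assumption about where
those references come from; the extra 1 for swapped reflects the reference
the swap cache holds on a page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the struct page state involved. */
struct page_model {
	int mapcount;	/* number of ptes mapping the page */
	int count;	/* total references held on the page */
	bool swapcache;	/* page also sits in the swap cache */
};

/*
 * Returns true when the page carries references beyond those accounted
 * for, e.g. one pinned by get_user_pages_fast() on behalf of O_DIRECT.
 */
static bool has_unexpected_refs(const struct page_model *page)
{
	int swapped = page->swapcache ? 1 : 0;

	assert(page->mapcount >= 0 && page->count >= 0);
	return page->mapcount + 2 + swapped != page->count;
}
```

With one mapping and no swap cache, a count of 3 passes the check, while a
count of 4 makes the function bail out and restore the original pte.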
/*
 * try_to_merge_one_page - take two pages and merge them into one
 * @mm: mm_struct that holds the vma pointing to oldpage
 * @vma: the vma that holds the pte pointing to oldpage
 * @oldpage: the page that we want to replace with newpage
 * @newpage: the page that we want to map instead of oldpage
 * @newprot: the new permission of the pte inside vma
 *
 * Note: oldpage should be an anon page, while newpage should be a
 * file-mapped page.
 *
 * This function returns 0 if the pages were merged, 1 otherwise.
 */
static int try_to_merge_one_page(struct mm_struct *mm,
				 struct vm_area_struct *vma,
				 struct page *oldpage,
				 struct page *newpage,
				 pgprot_t newprot)
{
	int ret = 1;
	pte_t orig_pte;

	if (!PageAnon(oldpage))
		goto out;

	get_page(newpage);
	get_page(oldpage);

	/*
	 * We need the page lock to read a stable PageSwapCache in
	 * write_protect_page().
	 * We use trylock_page() instead of lock_page() because we don't
	 * want to wait here: we prefer to continue scanning and merging
	 * different pages, and to come back to this page when it is unlocked.
	 */
	if (!trylock_page(oldpage))
		goto out_putpage;

	if (write_protect_page(oldpage, vma, &orig_pte)) {
		unlock_page(oldpage);
		goto out_putpage;
	}
	unlock_page(oldpage);

	if (pages_identical(oldpage, newpage))
		ret = replace_page(vma, oldpage, newpage, orig_pte, newprot);

out_putpage:
	put_page(oldpage);
	put_page(newpage);
out:
	return ret;
}
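
The body of addr_in_vma() is not shown in this message. Since it is
described as a copy of rmap.c's vma_address(), the arithmetic is
presumably along the lines of the userspace sketch below. struct
vma_model, model_vma_address() and MODEL_PAGE_SHIFT are hypothetical names
for illustration, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_area_struct fields that matter here. */
struct vma_model {
	unsigned long vm_start;	/* first address in the mapping */
	unsigned long vm_end;	/* first address past the mapping */
	unsigned long vm_pgoff;	/* page offset that vm_start maps */
};

/*
 * Translate a page index into the user address at which the vma would
 * map it, or -EFAULT (as an unsigned long) if it falls outside the vma.
 */
static unsigned long model_vma_address(const struct vma_model *vma,
				       unsigned long page_index)
{
	unsigned long addr;

	assert(vma->vm_start <= vma->vm_end);
	addr = vma->vm_start +
	       ((page_index - vma->vm_pgoff) << MODEL_PAGE_SHIFT);
	if (addr < vma->vm_start || addr >= vma->vm_end)
		return (unsigned long)-EFAULT;
	return addr;
}
```

An index below vm_pgoff wraps around in unsigned arithmetic and lands
below vm_start, so the same range check rejects it.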