linux-kernel - Re: [PATCH 0/4] RFC - ksm api change into madvise

Message-ID: <Pine.LNX.4.64.0906092013580.31606@sister.anvils>
Date:	Tue, 9 Jun 2009 20:27:29 +0100 (BST)
From:	Hugh Dickins <hugh.dickins@...cali.co.uk>
To:	Izik Eidus <ieidus@...hat.com>
cc:	aarcange@...hat.com, akpm@...ux-foundation.org,
	nickpiggin@...oo.com.au, chrisw@...hat.com, riel@...hat.com,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 0/4] RFC - ksm api change into madvise

On Tue, 9 Jun 2009, Hugh Dickins wrote:
> On Tue, 9 Jun 2009, Izik Eidus wrote:
> > How does this look like?
> 
> One improvement to make now, though: you've elsewhere avoided
> the pgd,pud,pmd,pte descent in ksm.c (using get_pte instead), and
> page_check_address() is not static to rmap.c (filemap_xip wanted it),
> so please continue to use that.  It's not exported, right, but I think
> Chris was already decisive that we should abandon modular KSM, yes?

I think you can simplify it further, can't you?  Isn't the get_pte()
preamble in try_to_merge_one_page() just unnecessary overhead now?  See
untested code below.  Or even move the trylock/unlock of the page into
write_protect_page if you prefer.  Later on we'll uninline rmap.c's
vma_address() so you can use it instead of your addr_in_vma() copy.

Hugh
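
For readers unfamiliar with the address calculation that addr_in_vma() and rmap.c's vma_address() perform, here is a minimal user-space sketch. The struct names, field subset, and MODEL_PAGE_SHIFT are hypothetical stand-ins, not the kernel definitions; only the arithmetic (page offset relative to the vma's starting offset, range-checked, -EFAULT on a miss) is meant to match.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel types: only the fields that
 * the address calculation needs are modelled here. */
#define MODEL_PAGE_SHIFT 12

struct model_vma {
	unsigned long vm_start;	/* first byte covered by the mapping */
	unsigned long vm_end;	/* first byte past the mapping */
	unsigned long vm_pgoff;	/* offset of vm_start, in pages */
};

struct model_page {
	unsigned long index;	/* offset of the page, in pages */
};

/* Returns the address at which the page would be mapped in the vma,
 * or -EFAULT (converted to unsigned long) if it falls outside it. */
static unsigned long model_addr_in_vma(const struct model_vma *vma,
				       const struct model_page *page)
{
	unsigned long addr;

	assert(vma->vm_start < vma->vm_end);
	addr = vma->vm_start +
	       ((page->index - vma->vm_pgoff) << MODEL_PAGE_SHIFT);
	/* An index below vm_pgoff wraps around and fails this check too. */
	if (addr < vma->vm_start || addr >= vma->vm_end)
		return (unsigned long)-EFAULT;
	return addr;
}
```

The caller compares the result against -EFAULT, as write_protect_page() does below.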

static inline int write_protect_page(struct page *page,
				     struct vm_area_struct *vma,
				     pte_t *orig_pte)
{
	struct mm_struct *mm = vma->vm_mm;
	unsigned long addr;
	pte_t *ptep;
	spinlock_t *ptl;
	int swapped;
	int ret = 1;

	addr = addr_in_vma(vma, page);
	if (addr == -EFAULT)
		goto out;

	ptep = page_check_address(page, mm, addr, &ptl, 0);
	if (!ptep)
		goto out;

	if (pte_write(*ptep)) {
		pte_t entry;

		swapped = PageSwapCache(page);
		flush_cache_page(vma, addr, page_to_pfn(page));
		/*
		 * This is tricky: get_user_pages_fast() takes no lock when
		 * it runs, so checking the page count against the mapcount
		 * is racy, and O_DIRECT could start right after the check.
		 * So we clear the pte and flush the TLB before the check;
		 * this ensures that no O_DIRECT can start after the check
		 * or in the middle of it.
		 */
		entry = ptep_clear_flush(vma, addr, ptep);
		/*
		 * Check that no O_DIRECT or similar I/O is in progress on the
		 * page
		 */
		if ((page_mapcount(page) + 2 + swapped) != page_count(page)) {
			set_pte_at_notify(mm, addr, ptep, entry);
			goto out_unlock;
		}
		entry = pte_wrprotect(entry);
		set_pte_at_notify(mm, addr, ptep, entry);
	}
	/* Record the pte even when it was already read-only: the caller uses it. */
	*orig_pte = *ptep;
	ret = 0;

out_unlock:
	pte_unmap_unlock(ptep, ptl);
out:
	return ret;
}
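
The reference-count test in write_protect_page() can be modelled in user space as below. This is only a sketch: struct ref_page and its fields are hypothetical, and reading the constant 2 as "references held by the KSM code path itself" is my assumption; the arithmetic is copied from the check above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the counters the check looks at. */
struct ref_page {
	int mapcount;	/* number of ptes mapping the page */
	int count;	/* total references held on the page */
	bool swapcache;	/* page also sits in the swap cache */
};

/* Assumption: the 2 in the original check stands for references that
 * the KSM path itself holds while it examines the page. */
#define MODEL_KSM_REFS 2

/* True if someone other than the mappings, the swap cache and KSM
 * holds a reference, for example a get_user_pages_fast() pin. */
static bool model_has_unexpected_refs(const struct ref_page *page)
{
	int expected;

	assert(page->mapcount >= 0 && page->count >= 0);
	expected = page->mapcount + MODEL_KSM_REFS +
		   (page->swapcache ? 1 : 0);
	return expected != page->count;
}
```

When the function returns true, the original code restores the pte unchanged and gives up on the page.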

/*
 * try_to_merge_one_page - take two pages and merge them into one
 * @mm: mm_struct that holds the vma pointing into oldpage
 * @vma: the vma that holds the pte pointing into oldpage
 * @oldpage: the page that we want to replace with newpage
 * @newpage: the page that we want to map instead of oldpage
 * @newprot: the new permission of the pte inside vma
 * note:
 * oldpage should be an anon page while newpage should be a file-mapped page
 *
 * this function returns 0 if the pages were merged, 1 otherwise.
 */
static int try_to_merge_one_page(struct mm_struct *mm,
				 struct vm_area_struct *vma,
				 struct page *oldpage,
				 struct page *newpage,
				 pgprot_t newprot)
{
	int ret = 1;
	pte_t orig_pte;

	if (!PageAnon(oldpage))
		goto out;

	get_page(newpage);
	get_page(oldpage);

	/*
	 * We need the page lock to read a stable PageSwapCache in
	 * write_protect_page().
	 * We use trylock_page() instead of lock_page() because we don't want
	 * to wait here; we prefer to continue scanning and merging different
	 * pages and to come back to this page when it is unlocked.
	 */
	if (!trylock_page(oldpage))
		goto out_putpage;

	if (write_protect_page(oldpage, vma, &orig_pte)) {
		unlock_page(oldpage);
		goto out_putpage;
	}
	unlock_page(oldpage);

	if (pages_identical(oldpage, newpage))
		ret = replace_page(vma, oldpage, newpage, orig_pte, newprot);

out_putpage:
	put_page(oldpage);
	put_page(newpage);
out:
	return ret;
}
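
The "try the lock, otherwise move on and come back later" policy described in the comment above can be illustrated outside the kernel roughly as follows. This is a user-space analogy, not KSM code: struct scan_item, scan_pass(), and the use of C11 atomic_flag as a stand-in for the page lock are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical work item: a flag standing in for the page lock, and a
 * marker recording whether this item has already been processed. */
struct scan_item {
	atomic_flag locked;
	int merged;
};

/* One scanning pass. Items whose lock is currently held elsewhere are
 * skipped rather than waited for; a later pass picks them up again.
 * Returns the number of items skipped in this pass. */
static size_t scan_pass(struct scan_item *items, size_t n)
{
	size_t skipped = 0;

	for (size_t i = 0; i < n; i++) {
		if (items[i].merged)
			continue;
		/* Analogue of trylock_page(): never blocks. */
		if (atomic_flag_test_and_set(&items[i].locked)) {
			skipped++;
			continue;
		}
		items[i].merged = 1;
		atomic_flag_clear(&items[i].locked);
	}
	return skipped;
}
```

A second call to scan_pass() after the other holder has released its lock processes the items skipped the first time.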