lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 19 Jun 2019 15:24:50 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Minchan Kim <minchan@...nel.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        linux-mm <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>, linux-api@...r.kernel.org,
        Johannes Weiner <hannes@...xchg.org>,
        Tim Murray <timmurray@...gle.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Daniel Colascione <dancol@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Sonny Rao <sonnyrao@...gle.com>,
        Brian Geffon <bgeffon@...gle.com>, jannh@...gle.com,
        oleg@...hat.com, christian@...uner.io, oleksandr@...hat.com,
        hdanton@...a.com, lizeb@...gle.com
Subject: Re: [PATCH v2 4/5] mm: introduce MADV_PAGEOUT

On Mon 10-06-19 20:12:51, Minchan Kim wrote:
[...]
> +static int madvise_pageout_pte_range(pmd_t *pmd, unsigned long addr,
> +				unsigned long end, struct mm_walk *walk)

Again the same question about a potential code reuse...
[...]
> +regular_page:
> +	tlb_change_page_size(tlb, PAGE_SIZE);
> +	orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> +	flush_tlb_batched_pending(mm);
> +	arch_enter_lazy_mmu_mode();
> +	for (; addr < end; pte++, addr += PAGE_SIZE) {
> +		ptent = *pte;
> +		if (!pte_present(ptent))
> +			continue;
> +
> +		page = vm_normal_page(vma, addr, ptent);
> +		if (!page)
> +			continue;
> +
> +		if (isolate_lru_page(page))
> +			continue;
> +
> +		isolated++;
> +		if (pte_young(ptent)) {
> +			ptent = ptep_get_and_clear_full(mm, addr, pte,
> +							tlb->fullmm);
> +			ptent = pte_mkold(ptent);
> +			set_pte_at(mm, addr, pte, ptent);
> +			tlb_remove_tlb_entry(tlb, pte, addr);
> +		}
> +		ClearPageReferenced(page);
> +		test_and_clear_page_young(page);
> +		list_add(&page->lru, &page_list);
> +		if (isolated >= SWAP_CLUSTER_MAX) {

Why do we need SWAP_CLUSTER_MAX batching? Especially when we need ...
[...]

> +unsigned long reclaim_pages(struct list_head *page_list)
> +{
> +	int nid = -1;
> +	unsigned long nr_reclaimed = 0;
> +	LIST_HEAD(node_page_list);
> +	struct reclaim_stat dummy_stat;
> +	struct scan_control sc = {
> +		.gfp_mask = GFP_KERNEL,
> +		.priority = DEF_PRIORITY,
> +		.may_writepage = 1,
> +		.may_unmap = 1,
> +		.may_swap = 1,
> +	};
> +
> +	while (!list_empty(page_list)) {
> +		struct page *page;
> +
> +		page = lru_to_page(page_list);
> +		if (nid == -1) {
> +			nid = page_to_nid(page);
> +			INIT_LIST_HEAD(&node_page_list);
> +		}
> +
> +		if (nid == page_to_nid(page)) {
> +			list_move(&page->lru, &node_page_list);
> +			continue;
> +		}
> +
> +		nr_reclaimed += shrink_page_list(&node_page_list,
> +						NODE_DATA(nid),
> +						&sc, 0,
> +						&dummy_stat, false);

per-node batching in fact. Other than that nothing really jumped at me.
Except for the shared page cache side channel timing aspect not being
considered AFAICS. To be more specific. Pushing out a shared page cache
is possible even now but this interface gives a much easier tool to
evict shared state and perform all sorts of timing attacks. Unless I am
missing something we should be doing something similar to mincore and
ignore shared pages without a writeable access or at least document why
we do not care.
-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ