linux-kernel - Re: [PATCH v8 05/22] Add vm_replace

Message-ID: <20140728132558.GA967@node.dhcp.inet.fi>
Date:	Mon, 28 Jul 2014 16:25:58 +0300
From:	"Kirill A. Shutemov" <kirill@...temov.name>
To:	Matthew Wilcox <willy@...ux.intel.com>
Cc:	Matthew Wilcox <matthew.r.wilcox@...el.com>,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8 05/22] Add vm_replace_mixed()

On Fri, Jul 25, 2014 at 03:44:50PM -0400, Matthew Wilcox wrote:
> On Wed, Jul 23, 2014 at 06:55:00PM +0300, Kirill A. Shutemov wrote:
> > >         update_hiwater_rss(mm);
> > 
> > No: you cannot end up with lower rss after replace, iiuc.
> 
> Actually, you can ... when we replace a real page with a PFN, our rss
> decreases.

Okay.

> > Do you mean you pointed to new file all the time? O_CREAT doesn't truncate
> > file if it exists, iirc.
> 
> It was pointing to a new file.  Still not sure why that one failed to trigger
> the problem.  The slightly modified version attached triggered the problem
> *just fine* :-)
> 
> I've attached all the patches in my tree so far.  For the v9 patch kit,
> I'll keep patch 3 as a separate patch, but roll patches 1, 2 and 4 into
> other patches.
> 
> I am seeing something odd though.  When I run double-map with debugging
> printks inserted in strategic spots in the kernel, I see four calls to
> do_dax_fault().  The first two, as expected, are the loads from the two
> mapped addresses.  The third is via mkwrite, but then the fourth time
> I get a regular page fault for write, and I don't understand why I get it.
> 
> Any ideas?

unmap_mapping_range() clears the pte you've just set with vm_replace_mixed()
on the third fault, which is why the write faults again.

And the locking looks wrong: it seems you need to hold i_mmap_mutex while
replacing the hole page with a pfn. Your VM_BUG_ON() in zap_pte_single()
triggers on my setup.

> +static void zap_pte_single(struct vm_area_struct *vma, pte_t *pte,
> +				unsigned long addr)
> +{
> +	struct mm_struct *mm = vma->vm_mm;
> +	int force_flush = 0;
> +	int rss[NR_MM_COUNTERS];
> +
> +	VM_BUG_ON(!mutex_is_locked(&vma->vm_file->f_mapping->i_mmap_mutex));

It's the wrong place for VM_BUG_ON(): zap_pte_single() on an anon mapping
should work fine.
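
One possible reading of the point above, as a userspace sketch only: an
anonymous VMA has no vm_file, so a check that unconditionally follows
vma->vm_file->f_mapping cannot work there. The model_* types and
model_zap_lock_ok() below are hypothetical stand-ins, not kernel
definitions, and the flag replaces a real mutex_is_locked() call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct model_mapping {
	bool i_mmap_locked;		/* models mutex_is_locked(&i_mmap_mutex) */
};

struct model_file {
	struct model_mapping *f_mapping;
};

struct model_vma {
	struct model_file *vm_file;	/* NULL for an anonymous mapping */
};

/*
 * Return true when the locking precondition holds for this VMA.
 * Anonymous VMAs have no i_mmap lock to check, so they always pass;
 * file-backed VMAs pass only while the mapping's lock is held.
 */
static bool model_zap_lock_ok(const struct model_vma *vma)
{
	if (!vma->vm_file)
		return true;
	return vma->vm_file->f_mapping->i_mmap_locked;
}
```

Whether the check belongs in the helper at all or in its file-backed
caller is a separate question; the sketch only shows that the anon case
has to be excluded before vm_file is dereferenced.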

> +
> +	init_rss_vec(rss);

A vector to commit a single update to the mm counters? What about updating
the counters inline for the rss == NULL case?
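
A rough userspace sketch of what I mean, assuming the accounting step is
allowed to take rss == NULL. All model_* names are hypothetical and the
counters are plain longs rather than the kernel's mm counters; it only
shows that the single-pte path can skip the vector and the commit step.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace toy model; none of these are the kernel's definitions. */
enum { MODEL_FILEPAGES, MODEL_ANONPAGES, MODEL_NR_COUNTERS };

struct model_mm {
	long counters[MODEL_NR_COUNTERS];
};

/*
 * Account for one zapped page. With a non-NULL rss vector the change is
 * batched for a later commit; with rss == NULL it is applied to the mm
 * counters immediately.
 */
static void model_account_zap(struct model_mm *mm, int *rss, int member)
{
	assert(member >= 0 && member < MODEL_NR_COUNTERS);
	if (rss)
		rss[member]--;
	else
		mm->counters[member]--;
}

/* Batched path: commit the accumulated deltas in one go. */
static void model_commit_vec(struct model_mm *mm, const int *rss)
{
	for (int i = 0; i < MODEL_NR_COUNTERS; i++)
		mm->counters[i] += rss[i];
}

/* Single-pte path: no vector to initialise and no commit afterwards. */
static void model_zap_single(struct model_mm *mm, int member)
{
	model_account_zap(mm, NULL, member);
}
```

Both paths leave the counters in the same state for one page; the inline
one just avoids setting up a vector to carry a single decrement.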

> +	update_hiwater_rss(mm);
> +	flush_cache_page(vma, addr, pte_pfn(*pte));
> +	zap_pte(NULL, vma, pte, addr, NULL, rss, &force_flush);
> +	flush_tlb_page(vma, addr);
> +	add_mm_rss_vec(mm, rss);
> +}
> +

-- 
 Kirill A. Shutemov