Message-Id: <01ed36eb-bb1d-bb75-57f9-90159985e75e@linux.vnet.ibm.com>
Date:   Tue, 31 Jan 2017 10:40:07 +0530
From:   Anshuman Khandual <khandual@...ux.vnet.ibm.com>
To:     Dave Hansen <dave.hansen@...el.com>,
        Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Cc:     mhocko@...e.com, vbabka@...e.cz, mgorman@...e.de,
        minchan@...nel.org, aneesh.kumar@...ux.vnet.ibm.com,
        bsingharora@...il.com, srikar@...ux.vnet.ibm.com,
        haren@...ux.vnet.ibm.com, jglisse@...hat.com,
        dan.j.williams@...el.com
Subject: Re: [RFC V2 11/12] mm: Tag VMA with VM_CDM flag during page fault

On 01/30/2017 11:21 PM, Dave Hansen wrote:
> Here's the flag definition:
> 
>> +#ifdef CONFIG_COHERENT_DEVICE
>> +#define VM_CDM		0x00800000	/* Contains coherent device memory */
>> +#endif
> 
> But it doesn't match the implementation:
> 
>> +#ifdef CONFIG_COHERENT_DEVICE
>> +static void mark_vma_cdm(nodemask_t *nmask,
>> +		struct page *page, struct vm_area_struct *vma)
>> +{
>> +	if (!page)
>> +		return;
>> +
>> +	if (vma->vm_flags & VM_CDM)
>> +		return;
>> +
>> +	if (nmask && !nodemask_has_cdm(*nmask))
>> +		return;
>> +
>> +	if (is_cdm_node(page_to_nid(page)))
>> +		vma->vm_flags |= VM_CDM;
>> +}
> 
> That flag is a one-way trip.  Any VMA with that flag set on it will keep
> it for the life of the VMA, despite whether it has CDM pages in it now
> or not.  Even if you changed the policy back to one that doesn't allow
> CDM and forced all the pages to be migrated out.

Right, we have this limitation right now. But as I mentioned in the reply
on the other thread, I will work towards both static and runtime
re-evaluation of the VMA flag next time around.
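
For the runtime part, something along these lines is what I have in mind
(untested sketch for discussion only; vma_recheck_cdm() is a made-up
name and is not part of this series):

/*
 * Untested sketch: after pages have been migrated out (e.g. on a
 * policy change), walk the VMA and drop VM_CDM if no CDM page is
 * mapped any more.  Caller would hold mmap_sem for write.
 */
static void vma_recheck_cdm(struct vm_area_struct *vma)
{
	unsigned long addr;
	struct page *page;

	if (!(vma->vm_flags & VM_CDM))
		return;

	for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) {
		page = follow_page(vma, addr, FOLL_GET);
		if (IS_ERR_OR_NULL(page))
			continue;
		if (is_cdm_node(page_to_nid(page))) {
			put_page(page);
			return;		/* still backed by CDM memory */
		}
		put_page(page);
	}
	vma->vm_flags &= ~VM_CDM;	/* no CDM pages left */
}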

> 
> This also assumes that the only way to get a page mapped into a VMA is
> via alloc_pages_vma().  Do the NUMA migration APIs use this path?

Right now I have just taken care of these two paths.

* Page fault path
* mbind() path

Agreed, I will work on the NUMA migration API paths next. I am wondering
whether I also need to update the migrate_pages() kernel API, as it will
be used by the driver, or whether the driver should tag the VMA
explicitly, knowing what has just happened (see the sketch after the
quoted cover letter below). I had also mentioned this in the cover
letter :) But as you have pointed out, I will move the documentation
into the patches.

"
VM_CDM tagged VMA:

There are two parts to this problem.

* How to mark a VMA with VM_CDM ?
	- During page fault path
	- During mbind(MPOL_BIND) call
	- Any other paths ?
	- Should a driver mark a VMA with VM_CDM explicitly ?

* How VM_CDM marked VMA gets treated ?

	- Disabled from auto NUMA migrations
	- Disabled from KSM merging
	- Anything else ?
"

> 
> When you *set* this flag, you don't go and turn off KSM merging, for
> instance.  You keep it from being turned on from this point forward, but
> you don't turn it off.

I was under the impression that KSM merging does not start unless we do
a madvise(MADV_MERGEABLE) call on the VMA (which is where it is blocked
now). I might be missing something here if it can start beforehand.
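
To be concrete, what I mean is a check along these lines in the
MADV_MERGEABLE path (rough sketch of the idea, not the exact hunk from
the series):

/*
 * Rough sketch: the MADV_MERGEABLE handling in ksm_madvise() bails out
 * early for a CDM tagged VMA, so VM_MERGEABLE is never set and KSM
 * never starts scanning it.
 */
int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
		unsigned long end, int advice, unsigned long *vm_flags)
{
	switch (advice) {
	case MADV_MERGEABLE:
#ifdef CONFIG_COHERENT_DEVICE
		if (vma->vm_flags & VM_CDM)
			return 0;	/* skip CDM VMAs */
#endif
		/* ... existing MADV_MERGEABLE handling ... */
		break;
	case MADV_UNMERGEABLE:
		/* ... existing MADV_UNMERGEABLE handling ... */
		break;
	}
	return 0;
}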

> 
> This is happening with mmap_sem held for read.  Correct?  Is it OK that
> you're modifying the VMA?  That vm_flags manipulation is non-atomic, so
> how can that even be safe?

Hmm, should it be done with mmap_sem held for write? I will look into
this further. But is intercepting the page faults inside alloc_pages_vma()
to tag the VMA okay from an overall design perspective? Or should this
be moved up or down the call chain in the page fault path?
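
One option I am thinking about (sketch only; the helper name is made
up): do the tagging from the mbind() side, where do_mbind() already
holds mmap_sem for write, and leave the fault path read-only:

/*
 * Sketch only: tag the VMA at mbind() time, where the caller
 * (do_mbind) already holds mmap_sem for write, instead of modifying
 * vm_flags from the fault path under the read lock.
 */
static void mbind_tag_cdm_vma(struct vm_area_struct *vma,
			      nodemask_t *nmask)
{
	if (nmask && nodemask_has_cdm(*nmask))
		vma->vm_flags |= VM_CDM;
}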

> 
> If you're going to go down this route, I think you need to be very
> careful.  We need to ensure that when this flag gets set, it's never set
> on VMAs that are "normal" and will only be set on VMAs that were
> *explicitly* set up for accessing CDM.  That means that you'll need to
> make sure that there's no possible way to get a CDM page faulted into a
> VMA unless it's via an explicitly assigned policy that would have cause
> the VMA to be split from any "normal" one in the system.
> 
> This all makes me really nervous.

Got it, will work towards this.
