Message-ID: <1808730857.1637296.1486566279643.JavaMail.zimbra@redhat.com>
Date:   Wed, 8 Feb 2017 10:04:39 -0500 (EST)
From:   Jerome Glisse <jglisse@...hat.com>
To:     Dave Hansen <dave.hansen@...el.com>
Cc:     Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org, mhocko@...e.com,
        vbabka@...e.cz, mgorman@...e.de, minchan@...nel.org,
        aneesh kumar <aneesh.kumar@...ux.vnet.ibm.com>,
        bsingharora@...il.com, srikar@...ux.vnet.ibm.com,
        haren@...ux.vnet.ibm.com, dan j williams <dan.j.williams@...el.com>
Subject: Re: [RFC V2 12/12] mm: Tag VMA with VM_CDM flag explicitly during
 mbind(MPOL_BIND)

> On 01/30/2017 08:36 PM, Anshuman Khandual wrote:
> > On 01/30/2017 11:24 PM, Dave Hansen wrote:
> >> On 01/29/2017 07:35 PM, Anshuman Khandual wrote:
> >>> +		if ((new_pol->mode == MPOL_BIND)
> >>> +			&& nodemask_has_cdm(new_pol->v.nodes))
> >>> +			set_vm_cdm(vma);
> >> So, if you did:
> >>
> >> 	mbind(addr, PAGE_SIZE, MPOL_BIND, all_nodes, ...);
> >> 	mbind(addr, PAGE_SIZE, MPOL_BIND, one_non_cdm_node, ...);
> >>
> >> You end up with a VMA that can never have KSM done on it, etc...  Even
> >> though there's no good reason for it.  I guess /proc/$pid/smaps might be
> >> able to help us figure out what was going on here, but that still seems
> >> like an awful lot of damage.
> > 
> > Agreed, this VMA should not remain tagged after the second call. It does
> > not make sense. For this kind of scenario we can re-evaluate the VMA
> > tag every time a nodemask change is attempted. But if we are looking for
> > some runtime re-evaluation then we need to steal some cycles during
> > general VMA processing opportunity points like merging and splitting to do
> > the necessary re-evaluation. Should we do these two kinds of
> > re-evaluation to be more optimal?
> 
> I'm still unconvinced that you *need* detection like this.  Scanning big
> VMAs is going to be really painful.
> 
> I thought I asked before but I can't find it in this thread.  But, we
> have explicit interfaces for disabling KSM and khugepaged.  Why do we
> need implicit ones like this in addition to those?
> 

As I said in another part of the thread, I think the vma flag is a no-go, because
it tries to set something that is orthogonal to the vma. Wanting some vma to
use device memory for new allocations is a valid policy for a vma to have. But
having a flag that tells various kernel subsystems "hey, my memory is special,
skip me" is wrong.

Wanting to exclude device memory from KSM or autonuma is valid, but it should
be done at the struct page level, i.e. KSM or autonuma should check the type
of the page before doing anything. For CDM pages they would simply skip. This
could use the flags idea that was discussed.
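
To make the page-level idea concrete, here is a rough userspace sketch. It is
a toy model, not kernel code: struct toy_page, the TOY_PAGE_* types and both
functions are hypothetical names invented for illustration, and the real check
would depend on whichever page flag or type mechanism ends up being used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page types; this enum is an assumption for the sketch. */
enum toy_page_type { TOY_PAGE_REGULAR, TOY_PAGE_CDM };

struct toy_page {
	enum toy_page_type type;
	bool scanned;	/* set once the toy scanner has processed the page */
};

/* The per-page test a KSM- or autonuma-like scanner would do first. */
static bool toy_page_is_cdm(const struct toy_page *page)
{
	return page->type == TOY_PAGE_CDM;
}

/*
 * Walk an array of pages, skipping CDM pages and processing the rest.
 * Returns the number of pages actually processed.
 */
static size_t toy_scan_pages(struct toy_page *pages, size_t nr)
{
	size_t processed = 0;

	assert(pages != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		if (toy_page_is_cdm(&pages[i]))
			continue;	/* device memory: leave it alone */
		pages[i].scanned = true;
		processed++;
	}
	return processed;
}
```

The decision is taken per page, so nothing has to be tracked on the vma.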

The overhead of doing it at the page level is far lower than trying to manage a
vma flag, with all the issues related to vma merging, splitting and the lifetime
of such a flag. Moreover, this flag is all or nothing: it does not consider the
case where you have as many regular pages as CDM pages in a vma. It would block
the regular pages from undergoing the usual KSM/autonuma ...
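
A small sketch of the mixed-vma case, again as a userspace toy with made-up
names (struct mix_vma, cdm_flag and the two mix_eligible_* helpers are all
hypothetical). It only counts how many pages each approach would leave
eligible for KSM/autonuma-style processing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page types, assumed only for this sketch. */
enum mix_page_type { MIX_PAGE_REGULAR, MIX_PAGE_CDM };

struct mix_vma {
	bool cdm_flag;	/* stand-in for the proposed VM_CDM tag */
	const enum mix_page_type *pages;
	size_t nr_pages;
};

/*
 * vma-flag approach: a tagged vma is skipped as a whole, and an untagged
 * one is processed as a whole. Page types are never looked at.
 */
static size_t mix_eligible_with_vma_flag(const struct mix_vma *vma)
{
	assert(vma != NULL);
	return vma->cdm_flag ? 0 : vma->nr_pages;
}

/* Page-level approach: only the CDM pages themselves are skipped. */
static size_t mix_eligible_with_page_check(const struct mix_vma *vma)
{
	size_t eligible = 0;

	assert(vma != NULL);
	for (size_t i = 0; i < vma->nr_pages; i++) {
		if (vma->pages[i] == MIX_PAGE_REGULAR)
			eligible++;
	}
	return eligible;
}
```

For a tagged vma that is half regular and half CDM, the first helper reports
no eligible pages at all, while the second still reports the regular half.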

I do strongly believe that this vma flag is a bad idea.

Cheers,
Jérôme
