Message-ID: <20130611145742.GB3411@sgi.com>
Date:	Tue, 11 Jun 2013 09:57:42 -0500
From:	Alex Thorlton <athorlton@....com>
To:	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:	linux-kernel@...r.kernel.org, Li Zefan <lizefan@...wei.com>,
	Rob Landley <rob@...dley.net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>,
	David Rientjes <rientjes@...gle.com>,
	linux-doc@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] Make transparent hugepages cpuset aware

On Tue, Jun 11, 2013 at 09:55:18AM +0300, Kirill A. Shutemov wrote:
> Alex Thorlton wrote:
> > This patch adds the ability to control THPs on a per cpuset basis.  Please see
> > the additions to Documentation/cgroups/cpusets.txt for more information.
> > 
> > Signed-off-by: Alex Thorlton <athorlton@....com>
> > Reviewed-by: Robin Holt <holt@....com>
> > Cc: Li Zefan <lizefan@...wei.com>
> > Cc: Rob Landley <rob@...dley.net>
> > Cc: Andrew Morton <akpm@...ux-foundation.org>
> > Cc: Mel Gorman <mgorman@...e.de>
> > Cc: Rik van Riel <riel@...hat.com>
> > Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> > Cc: Johannes Weiner <hannes@...xchg.org>
> > Cc: Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
> > Cc: David Rientjes <rientjes@...gle.com>
> > Cc: linux-doc@...r.kernel.org
> > Cc: linux-mm@...ck.org
> > ---
> >  Documentation/cgroups/cpusets.txt |  50 ++++++++++-
> >  include/linux/cpuset.h            |   5 ++
> >  include/linux/huge_mm.h           |  25 +++++-
> >  kernel/cpuset.c                   | 181 ++++++++++++++++++++++++++++++++++++++
> >  mm/huge_memory.c                  |   3 +
> >  5 files changed, 261 insertions(+), 3 deletions(-)
> > 
> > diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt
> > index 12e01d4..b7b2c83 100644
> > --- a/Documentation/cgroups/cpusets.txt
> > +++ b/Documentation/cgroups/cpusets.txt
> > @@ -22,12 +22,14 @@ CONTENTS:
> >    1.6 What is memory spread ?
> >    1.7 What is sched_load_balance ?
> >    1.8 What is sched_relax_domain_level ?
> > -  1.9 How do I use cpusets ?
> > +  1.9 What is thp_enabled ?
> > +  1.10 How do I use cpusets ?
> >  2. Usage Examples and Syntax
> >    2.1 Basic Usage
> >    2.2 Adding/removing cpus
> >    2.3 Setting flags
> >    2.4 Attaching processes
> > +  2.5 Setting thp_enabled flags
> >  3. Questions
> >  4. Contact
> >  
> > @@ -581,7 +583,34 @@ If your situation is:
> >  then increasing 'sched_relax_domain_level' would benefit you.
> >  
> >  
> > -1.9 How do I use cpusets ?
> > +1.9 What is thp_enabled ?
> > +-----------------------
> > +
> > +The thp_enabled file contained within each cpuset controls how transparent
> > +hugepages are handled within that cpuset.
> > +
> > +The root cpuset's thp_enabled flags mirror the flags set in
> > +/sys/kernel/mm/transparent_hugepage/enabled.  The flags in the root cpuset can
> > +only be modified by changing /sys/kernel/mm/transparent_hugepage/enabled. The
> > +thp_enabled file for the root cpuset is read only.  These flags cause the
> > +root cpuset to behave as one might expect:
> > +
> > +- When set to always, THPs are used whenever practical
> > +- When set to madvise, THPs are used only on chunks of memory that have the
> > +  MADV_HUGEPAGE flag set
> > +- When set to never, THPs are never allowed for tasks in this cpuset
> > +
> > +The behavior of thp_enabled for children of the root cpuset is where things
> > +become a bit more interesting.  The child cpusets accept the same flags as the
> > +root, but also have a default flag, which, when set, causes a cpuset to use the
> > +behavior of its parent.  When a child cpuset is created, its default flag is
> > +always initially set.
> > +
> > +Since the flags on child cpusets are allowed to differ from the flags on their
> > +parents, we are able to enable THPs for tasks in specific cpusets, and disable
> > +them in others.
> 
> Should we have a way for a parent cgroup to enforce child behaviour?
> Like a mask of allowed thp_enabled values children can choose.
> 

We don't have a use case for that particular scenario, so we didn't
include any such functionality.  Our main goal here was to allow 
cpusets to override the /sys/kernel/mm/transparent_hugepage/enabled 
setting.  If you have a use case for that scenario, then I think it
would be more suitable to add that functionality in a separate patch.
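
As a rough illustration of the override behaviour described above (not code
from the patch), the lookup can be modelled in user space.  Every name below
(cpuset_model, thp_mode, global_thp_mode, effective_thp_mode) is hypothetical,
and the loop assumes that a cpuset left at "default" keeps deferring upward
until it reaches an ancestor with an explicit setting, or the root, which
mirrors the global sysfs value:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the patch. */
enum thp_mode { THP_DEFAULT, THP_ALWAYS, THP_MADVISE, THP_NEVER };

struct cpuset_model {
	struct cpuset_model *parent;	/* NULL for the root cpuset */
	enum thp_mode mode;		/* THP_DEFAULT: use the parent's behavior */
};

/* Stand-in for /sys/kernel/mm/transparent_hugepage/enabled. */
static enum thp_mode global_thp_mode = THP_ALWAYS;

/*
 * Resolve the mode that applies to a cpuset: the nearest explicit
 * setting on the way up, or the global setting once the root is reached.
 */
static enum thp_mode effective_thp_mode(const struct cpuset_model *cs)
{
	assert(cs != NULL);

	while (cs->parent != NULL) {
		if (cs->mode != THP_DEFAULT)
			return cs->mode;
		cs = cs->parent;
	}

	/* The root cpuset only mirrors the global setting. */
	return global_thp_mode;
}
```

In this model a child set to never keeps THPs off for itself and for any
descendants left at default, regardless of the global setting, which is the
override we were after.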

> > @@ -177,6 +177,29 @@ static inline struct page *compound_trans_head(struct page *page)
> >  	return page;
> >  }
> >  
> > +#ifdef CONFIG_CPUSETS
> > +extern int cpuset_thp_always(struct task_struct *p);
> > +extern int cpuset_thp_madvise(struct task_struct *p);
> > +
> > +static inline int transparent_hugepage_enabled(struct vm_area_struct *vma)
> > +{
> > +	if (cpuset_thp_always(current))
> > +		return 1;
> 
> Why do you ignore VM_NOHUGEPAGE?
> And !is_vma_temporary_stack(__vma) is still relevant.
> 

That was an oversight on my part.  I've fixed it and will submit the
corrected patch shortly.  Thanks for pointing that out.
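
For reference, the check being asked for can be sketched as a user-space
model.  This is not the corrected patch; it only shows one way the two review
comments could be applied, with the VM_NOHUGEPAGE and temporary-stack tests
covering the "always" case as well as the "madvise" case.  The flag values,
struct vma_model and thp_enabled_model() are placeholders invented for this
sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag bits for this model; not the kernel's real values. */
#define MODEL_VM_HUGEPAGE	0x1UL
#define MODEL_VM_NOHUGEPAGE	0x2UL

/* Hypothetical stand-in for the parts of a VMA that the check reads. */
struct vma_model {
	unsigned long vm_flags;
	bool temporary_stack;	/* models is_vma_temporary_stack(vma) */
};

/*
 * cpuset_always and cpuset_madvise model the results of
 * cpuset_thp_always(current) and cpuset_thp_madvise(current).
 */
static bool thp_enabled_model(bool cpuset_always, bool cpuset_madvise,
			      const struct vma_model *vma)
{
	assert(vma != NULL);

	/* The cpuset policy decides whether a huge page is wanted at all. */
	bool wanted = cpuset_always ||
		(cpuset_madvise && (vma->vm_flags & MODEL_VM_HUGEPAGE) != 0);

	/* The per-VMA vetoes apply to both policies. */
	return wanted &&
		!(vma->vm_flags & MODEL_VM_NOHUGEPAGE) &&
		!vma->temporary_stack;
}
```

With the vetoes factored out like this, a VMA marked VM_NOHUGEPAGE is refused
even when the cpuset is set to always.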

> > +	else if (cpuset_thp_madvise(current) &&
> > +		 ((vma)->vm_flags & VM_HUGEPAGE) &&
> > +		 !((vma)->vm_flags & VM_NOHUGEPAGE) &&
> > +		 !is_vma_temporary_stack(vma))
> > +		return 1;
> > +	else
> > +		return 0;
> > +}
> > +#else
> > +static inline int transparent_hugepage_enabled(struct vm_area_struct *vma)
> > +{
> > +	return _transparent_hugepage_enabled(vma);
> > +}
> > +#endif
> > +
> >  extern int do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
> >  				unsigned long addr, pmd_t pmd, pmd_t *pmdp);
> >  
> 
> -- 
>  Kirill A. Shutemov

- Alex Thorlton