Message-ID: <alpine.DEB.2.02.1407231600110.1389@chino.kir.corp.google.com>
Date:	Wed, 23 Jul 2014 16:05:36 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Alex Thorlton <athorlton@....com>
cc:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org, mgorman@...e.de, riel@...hat.com,
	kirill.shutemov@...ux.intel.com, mingo@...nel.org,
	hughd@...gle.com, lliubbo@...il.com, hannes@...xchg.org,
	srivatsa.bhat@...ux.vnet.ibm.com, dave.hansen@...ux.intel.com,
	dfults@....com, hedi@....com
Subject: Re: [BUG] THP allocations escape cpuset when defrag is off

On Wed, 23 Jul 2014, Alex Thorlton wrote:

> > It's also been a long-standing issue that cpusets and mempolicies are 
> > ignored by khugepaged that allows memory to be migrated remotely to nodes 
> > that are not allowed by a cpuset's mems or a mempolicy's nodemask.  Even 
> > with this issue fixed, you may find that some memory is migrated remotely, 
> > although it may be negligible, by khugepaged.
> 
> A bit here and there is manageable.  There is, of course, some work to
> be done there, but for now we're mainly concerned with a job that's
> supposed to be confined to a cpuset spilling out and soaking up all the
> memory on a machine.
> 

You may find my patch[*] in -mm to be helpful if you enable 
zone_reclaim_mode.  It changes khugepaged so that it is not allowed to 
migrate any memory to a remote node where the distance between the nodes 
is greater than RECLAIM_DISTANCE.
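
The rule described above can be modeled in user space roughly as follows. This is only a sketch and not the code from the patch: the `node_distance` table, `MAX_NODES`, and `collapse_target_allowed()` are hypothetical names made up for illustration, and 30 is the x86 value of RECLAIM_DISTANCE cited in this message.

```c
#include <stdbool.h>

#define MAX_NODES 4
#define RECLAIM_DISTANCE 30   /* x86 value cited in this thread */

/* Hypothetical SLIT-style table: 10 means local, larger means farther. */
static const int node_distance[MAX_NODES][MAX_NODES] = {
    {10, 20, 40, 40},
    {20, 10, 40, 40},
    {40, 40, 10, 20},
    {40, 40, 20, 10},
};

/*
 * Illustrative predicate: may memory that lives on node src be
 * collapsed into a hugepage on node dst?  Callers are assumed to
 * pass node ids in the range [0, MAX_NODES).
 */
static bool collapse_target_allowed(bool zone_reclaim_mode, int src, int dst)
{
    /* Without zone_reclaim_mode the distance restriction does not apply. */
    if (!zone_reclaim_mode)
        return true;

    /* Reject targets farther away than RECLAIM_DISTANCE. */
    return node_distance[src][dst] <= RECLAIM_DISTANCE;
}
```

With this sample table, nodes 0 and 1 may collapse onto each other, but neither may collapse onto nodes 2 or 3 while zone_reclaim_mode is enabled.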

These issues are still pending, and we've encountered a couple of them in 
the past few weeks ourselves.  The definition of RECLAIM_DISTANCE, currently 
30 for x86, relies on the SLIT to define when remote access is 
costly, and there are cases where people need to alter the BIOS to 
work around this definition.
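
To see which nodes fall on the far side of that threshold on a given machine, the SLIT distances can be read from sysfs, where /sys/devices/system/node/nodeN/distance holds one space-separated row per node. A minimal sketch, assuming the row has already been read into a string; `count_far_nodes()` is a hypothetical helper, and the threshold is passed in by the caller rather than taken from the kernel.

```c
#include <stdlib.h>

/*
 * Count the entries in a space-separated distance row that are
 * strictly greater than the given threshold.
 */
static int count_far_nodes(const char *row, long threshold)
{
    int far = 0;
    char *end;

    for (;;) {
        long d = strtol(row, &end, 10);

        if (end == row)         /* no further number could be parsed */
            break;
        if (d > threshold)
            far++;
        row = end;              /* continue after the parsed number */
    }
    return far;
}
```

For example, calling it with the row "10 21 31 40" and a threshold of 30 counts the two nodes at distances 31 and 40.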

We can hope that NUMA balancing will solve a lot of these problems for us, 
but there's always a chance that the VM does something totally wrong, which 
you've undoubtedly encountered already.

 [*] http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-only-collapse-hugepages-to-nodes-with-affinity-for-zone_reclaim_mode.patch