Message-ID: <alpine.DEB.2.02.1407231600110.1389@chino.kir.corp.google.com>
Date: Wed, 23 Jul 2014 16:05:36 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Alex Thorlton <athorlton@....com>
cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, mgorman@...e.de, riel@...hat.com,
kirill.shutemov@...ux.intel.com, mingo@...nel.org,
hughd@...gle.com, lliubbo@...il.com, hannes@...xchg.org,
srivatsa.bhat@...ux.vnet.ibm.com, dave.hansen@...ux.intel.com,
dfults@....com, hedi@....com
Subject: Re: [BUG] THP allocations escape cpuset when defrag is off
On Wed, 23 Jul 2014, Alex Thorlton wrote:
> > It's also been a long-standing issue that cpusets and mempolicies are
> > ignored by khugepaged that allows memory to be migrated remotely to nodes
> > that are not allowed by a cpuset's mems or a mempolicy's nodemask. Even
> > with this issue fixed, you may find that some memory is migrated remotely,
> > although it may be negligible, by khugepaged.
>
> A bit here and there is manageable. There is, of course, some work to
> be done there, but for now we're mainly concerned with a job that's
> supposed to be confined to a cpuset spilling out and soaking up all the
> memory on a machine.
>
You may find my patch[*] in -mm to be helpful if you enable
zone_reclaim_mode. It changes khugepaged so that it is not allowed to
migrate any memory to a remote node where the distance between the nodes
is greater than RECLAIM_DISTANCE.
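
As a rough illustration of the rule described above (a userspace model only, not the kernel code from the patch), the sketch below filters candidate nodes by distance. The function name, the `zone_reclaim_mode` flag argument, and the 4-node distance table are hypothetical; the only values taken from this thread are the "greater than RECLAIM_DISTANCE" comparison and the x86 value of 30.

```python
# Userspace model of the node filter described in the message.
# Assumption: distances are given as a square matrix indexed by node id,
# in the same units the SLIT uses (10 means local).

RECLAIM_DISTANCE = 30  # value quoted in this thread for x86


def collapse_targets(distance, home, zone_reclaim_mode=True,
                     limit=RECLAIM_DISTANCE):
    """Return the node ids that memory on `home` may be collapsed to.

    With zone_reclaim_mode off, the restriction does not apply and every
    node is a candidate. With it on, any node whose distance from `home`
    is greater than `limit` is excluded.
    """
    if not zone_reclaim_mode:
        return list(range(len(distance[home])))
    return [nid for nid, d in enumerate(distance[home]) if d <= limit]


# Hypothetical 4-node table: nodes 0/1 are close, nodes 2/3 are close.
slit = [
    [10, 20, 40, 40],
    [20, 10, 40, 40],
    [40, 40, 10, 20],
    [40, 40, 20, 10],
]

print(collapse_targets(slit, 0))
print(collapse_targets(slit, 0, zone_reclaim_mode=False))
```

In this made-up topology, memory on node 0 could only be collapsed to nodes 0 and 1 while zone_reclaim_mode is enabled.
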

These issues are still pending and we've encountered a couple of them in
the past weeks ourselves. The definition of RECLAIM_DISTANCE, currently
30 on x86, relies on the SLIT to define when remote access is costly, and
there are cases where people need to alter the BIOS to work around this
definition.
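
To see what the firmware-reported distances look like on a given machine, one option is to read the per-node `distance` files under /sys/devices/system/node. The sketch below is a minimal example under two assumptions: that each file holds one whitespace-separated row of distances, and that the columns follow the online nodes in ascending id order. The helper names are hypothetical, and the threshold of 30 is the x86 value mentioned above.

```python
import glob
import os


def parse_distance_row(text):
    """Parse one sysfs distance row such as '10 21 31' into integers."""
    return [int(tok) for tok in text.split()]


def read_node_distances(base="/sys/devices/system/node"):
    """Return {node_id: [distances...]} for every node directory found."""
    rows = {}
    for path in glob.glob(os.path.join(base, "node[0-9]*", "distance")):
        name = os.path.basename(os.path.dirname(path))  # e.g. "node3"
        with open(path) as f:
            rows[int(name[len("node"):])] = parse_distance_row(f.read())
    return rows


def far_pairs(rows, limit=30):
    """List (from_node, to_node, distance) where distance exceeds limit.

    Assumption: columns in each row map to the sorted node ids in `rows`.
    """
    nodes = sorted(rows)
    return [(src, dst, d)
            for src in nodes
            for dst, d in zip(nodes, rows[src])
            if d > limit]


if __name__ == "__main__":
    for src, dst, d in far_pairs(read_node_distances()):
        print(f"node{src} -> node{dst}: distance {d} exceeds 30")
```

A pair that shows up here is one the distance check would treat as too remote; if the table itself is wrong, that is the kind of case where a BIOS change ends up being the workaround.
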

We can hope that NUMA balancing will solve a lot of these problems for us,
but there's always a chance that the VM does something totally wrong, which
you've undoubtedly encountered already.

[*] http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-only-collapse-hugepages-to-nodes-with-affinity-for-zone_reclaim_mode.patch