Message-ID: <20131217201147.GH21724@cmpxchg.org>
Date:	Tue, 17 Dec 2013 15:11:47 -0500
From:	Johannes Weiner <hannes@...xchg.org>
To:	Mel Gorman <mgorman@...e.de>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Hansen <dave.hansen@...el.com>,
	Rik van Riel <riel@...hat.com>,
	Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/7] mm: page_alloc: Use zone node IDs to approximate
 locality

On Tue, Dec 17, 2013 at 04:08:08PM +0000, Mel Gorman wrote:
> On Tue, Dec 17, 2013 at 10:38:29AM -0500, Johannes Weiner wrote:
> > On Tue, Dec 17, 2013 at 11:13:52AM +0000, Mel Gorman wrote:
> > > On Mon, Dec 16, 2013 at 03:25:07PM -0500, Johannes Weiner wrote:
> > > > On Fri, Dec 13, 2013 at 02:10:03PM +0000, Mel Gorman wrote:
> > > > > zone_local() uses node_distance(), which is a more expensive call than
> > > > > necessary. On x86, it's another function call in the allocator fast path
> > > > > and increases cache footprint. This patch makes the assumption that zones
> > > > > on the local node share the local node's ID. The necessary information
> > > > > should already be cache hot.
> > > > > 
> > > > > Signed-off-by: Mel Gorman <mgorman@...e.de>
> > > > > ---
> > > > >  mm/page_alloc.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > > index 64020eb..fd9677e 100644
> > > > > --- a/mm/page_alloc.c
> > > > > +++ b/mm/page_alloc.c
> > > > > @@ -1816,7 +1816,7 @@ static void zlc_clear_zones_full(struct zonelist *zonelist)
> > > > >  
> > > > >  static bool zone_local(struct zone *local_zone, struct zone *zone)
> > > > >  {
> > > > > -	return node_distance(local_zone->node, zone->node) == LOCAL_DISTANCE;
> > > > > +	return zone_to_nid(zone) == numa_node_id();
> > > > 
> > > > Why numa_node_id()?  We pass in the preferred zone as @local_zone:
> > > > 
> > > 
> > > Initially because I was thinking "local node", and numa_node_id() is a
> > > per-cpu variable that should be cheap to access and in some cases
> > > cache-hot, as the top-level gfp API calls numa_node_id().
> > > 
> > > Thinking about it more, though, it still makes sense because the preferred
> > > zone is not necessarily local. If the allocation request requires ZONE_DMA32
> > > and the local node does not have that zone, then the preferred zone is on a
> > > remote node.
> > 
> > Don't we treat everything in relation to the preferred zone?
> 
> Usually yes, but this time we really care about whether the memory is
> local or remote. It makes sense to me as it is, and I struggle to see an
> advantage in expressing it in terms of the preferred zone. Minimally,
> zone_local would need to be renamed if it could return true for a remote
> zone, and I see no advantage in doing that.

What the function tests for is whether any given zone is close
enough/local to the given preferred zone such that we can allocate
from it without having to invoke zone_reclaim_mode.

In your example, if the preferred DMA32 zone were on a remote node and
eligible for allocation but full, a DMA zone on that same remote node
should be fine as well: it would not impose a higher remote-reference
burden than allocating from the preferred DMA32 zone would.

So it's really not about the locality of the allocating task but about
the locality of the given preferred zone.
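
To make that concrete, here is a small stand-alone sketch of the two
predicates being discussed, applied to the DMA32 example above.  The toy
struct, the node numbers and the helper names are made up for illustration;
zone_to_nid() is approximated as a direct zone->node read, and only the
shape of the two comparisons comes from the patch hunk above and from the
replacement quoted just below:

	#include <stdbool.h>
	#include <stdio.h>

	/* Toy model of a zone, for illustration only. */
	struct zone {
		int node;	/* node whose memory backs this zone */
	};

	/* Stand-in for numa_node_id(): the node of the allocating CPU. */
	static int allocating_cpu_node = 0;

	static int toy_numa_node_id(void)
	{
		return allocating_cpu_node;
	}

	/* Patch version: compare the zone against the allocating CPU's node. */
	static bool zone_local_cpu(struct zone *local_zone, struct zone *zone)
	{
		(void)local_zone;
		return zone->node == toy_numa_node_id();
	}

	/* Version argued for here: compare against the preferred zone's node. */
	static bool zone_local_preferred(struct zone *local_zone, struct zone *zone)
	{
		return local_zone->node == zone->node;
	}

	int main(void)
	{
		/*
		 * The allocation needs ZONE_DMA32, the local node 0 has no
		 * such zone, so the preferred DMA32 zone lives on remote
		 * node 1.  The candidate is a DMA zone on that same node.
		 */
		struct zone preferred_dma32 = { .node = 1 };
		struct zone candidate_dma   = { .node = 1 };

		printf("cpu-node predicate accepts DMA zone:       %d\n",
		       zone_local_cpu(&preferred_dma32, &candidate_dma));
		printf("preferred-zone predicate accepts DMA zone: %d\n",
		       zone_local_preferred(&preferred_dma32, &candidate_dma));
		return 0;
	}

The first predicate rejects the node-1 DMA zone simply because the
allocating CPU happens to sit on node 0, even though falling back to that
zone is no more remote than the preferred DMA32 zone itself; the second
accepts it.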

In my tree, I replaced the function body with

	return local_zone->node == zone->node;
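
Put together with the signature from the hunk above, the resulting function
would read roughly as follows (a sketch, not a quote of any tree):

	/*
	 * A zone is "local" to the preferred zone if both sit on the same
	 * node, regardless of which node the allocating CPU is on.
	 */
	static bool zone_local(struct zone *local_zone, struct zone *zone)
	{
		return local_zone->node == zone->node;
	}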