Message-ID: <20140214001438.GB1651@linux.vnet.ibm.com>
Date: Thu, 13 Feb 2014 16:14:38 -0800
From: Nishanth Aravamudan <nacc@...ux.vnet.ibm.com>
To: David Rientjes <rientjes@...gle.com>
Cc: Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Fengguang Wu <fengguang.wu@...el.com>,
David Cohen <david.a.cohen@...ux.intel.com>,
Al Viro <viro@...iv.linux.org.uk>,
Damien Ramonda <damien.ramonda@...el.com>,
Jan Kara <jack@...e.cz>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH V5] mm readahead: Fix readahead fail for no local
memory and limit readahead pages
On 13.02.2014 [14:41:04 -0800], David Rientjes wrote:
> On Thu, 13 Feb 2014, Raghavendra K T wrote:
>
> > Thanks David, unfortunately even after applying that patch, I do not see
> > the improvement.
> >
> > Interestingly, numa_mem_id() still seems to return the value of a
> > memoryless node.
> > Maybe the per-cpu _numa_mem_ values are not set properly. I need to dig
> > into this.
> >
>
> I believe ppc will be relying on __build_all_zonelists() to set
> numa_mem_id() to be the proper node, and that relies on the ordering of
> the zonelist built for the memoryless node. It would be very strange if
> local_memory_node() is returning a memoryless node because it is the first
> zone for node_zonelist(GFP_KERNEL) (why would a memoryless node be on the
> zonelist at all?).
>
> I think the real problem is that build_all_zonelists() is only called at
> init when the boot cpu is online so it's only setting numa_mem_id()
> properly for the boot cpu. Does it return a node with memory if you
> toggle /proc/sys/vm/numa_zonelist_order? Do
>
> echo node > /proc/sys/vm/numa_zonelist_order
> echo zone > /proc/sys/vm/numa_zonelist_order
> echo default > /proc/sys/vm/numa_zonelist_order
>
> and check if it returns the proper value at either point. This will force
> build_all_zonelists() and numa_mem_id() to point to the proper node since
> all cpus are now online.
>
> So the prerequisite for CONFIG_HAVE_MEMORYLESS_NODES is that there is an
> arch-specific set_numa_mem() that makes this mapping correct like ia64
> does. If that's the case, then it's (1) completely undocumented and (2)
> Nishanth's patch is incomplete because anything that adds
> CONFIG_HAVE_MEMORYLESS_NODES needs to do the proper set_numa_mem() for it
> to be any different than numa_node_id().
I'm working on this latter bit now. I tried to mirror ia64, but it looks
like they have CONFIG_USE_PERCPU_NUMA_NODE_ID, which powerpc doesn't.
It seems like CONFIG_USE_PERCPU_NUMA_NODE_ID and
CONFIG_HAVE_MEMORYLESS_NODES should be tied together in Kconfig?
I'll keep working, but would appreciate any further insight.
-Nish