Message-ID: <20090228001400.GC7174@us.ibm.com>
Date: Fri, 27 Feb 2009 16:14:00 -0800
From: Gary Hade <garyhade@...ibm.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Gary Hade <garyhade@...ibm.com>, roel.kluin@...il.com,
mingo@...e.hu, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
y-goto@...fujitsu.com
Subject: Re: [PATCH] mm: get_nid_for_pfn() returns int

On Fri, Feb 27, 2009 at 01:46:16PM -0800, Andrew Morton wrote:
> On Fri, 27 Feb 2009 13:33:40 -0800
> Gary Hade <garyhade@...ibm.com> wrote:
>
> > On Fri, Feb 27, 2009 at 03:56:40PM +0100, roel kluin wrote:
> > > >> > > get_nid_for_pfn() returns int
> > >
> > > > >> > My mistake.  Good catch.
> > >
> > > >> Presumably the (nid < 0) case has never happened.
> > > >
> > > > We do know that it is happening on one system while creating
> > > > a symlink for a memory section so it should also happen on
> > > > the same system if unregister_mem_sect_under_nodes() were
> > > > called to remove the same symlink.
> > > >
> > > > The test was actually added in response to a problem with an
> > > > earlier version reported by Yasunori Goto where one or more
> > > > of the leading pages of a memory section on the 2nd node of
> > > > one of his systems was uninitialized because I believe they
> > > > > coincided with a memory hole.  The earlier version did not
> > > > > ignore uninitialized pages and determined the nid by considering
> > > > > only the 1st page of each memory section.  This caused the
> > > > > symlink to the 1st memory section on the 2nd node to be
> > > > > incorrectly created in /sys/devices/system/node/node0 instead
> > > > > of /sys/devices/system/node/node1.  The problem was fixed by
> > > > adding the test to skip over uninitialized pages.
> > > >
> > > > I suspect we have not seen any reports of the non-removal
> > > > of a symlink due to the incorrect declaration of the nid
> > > > variable in unregister_mem_sect_under_nodes() because
> > > > >   - systems where a memory section could have an uninitialized
> > > > >     range of leading pages are probably rare.
> > > > >   - memory remove is probably not done very frequently on the
> > > > >     systems that are capable of demonstrating the problem.
> > > > >   - lingering symlink(s) that should have been removed may
> > > > >     have simply gone unnoticed.
> > > >>
> > > >> Should we retain the test?
> > > >
> > > > Yes.
> > > >
> > > >>
> > > >> Is silently skipping the node in that case desirable behaviour?
> > > >
> > > > > It actually silently skips pages (not nodes) in its quest
> > > > for valid nids for all the nodes that the memory section scans.
> > > > This is definitely desirable.
> > > >
> > > > I hope this answers your questions.
> > >
> > > This still isn't applied, was it lost?
> >
> > It is still lingering in -mm:
> > http://userweb.kernel.org/~akpm/mmotm/broken-out/mm-get_nid_for_pfn-returns-int.patch
> >
>
> Should it unlinger? I have it in the 2.6.30 pile.

Yes, that would be good. :)

> Does it actually fix a demonstrable bug?

I am not aware of anyone who has actually reproduced the
problem.  I do not believe that we have any systems where
it can be reproduced, since it would require both
  (1) a memory section with an uninitialized range of
      pages and
  (2) a memory remove event for that memory section.
As far as I know, none of our systems have (1).  Yasunori Goto
has a system with (1) but I am not sure if he can do (2).

Gary

--
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503 IBM T/L: 775-4503
garyhade@...ibm.com
http://www.ibm.com/linux/ltc