Date:	Tue, 19 Apr 2011 14:49:40 -0500
From:	James Bottomley <James.Bottomley@...senPartnership.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	Pekka Enberg <penberg@...nel.org>, Michal Hocko <mhocko@...e.cz>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Hugh Dickins <hughd@...gle.com>, linux-mm@...ck.org,
	LKML <linux-kernel@...r.kernel.org>,
	linux-parisc@...r.kernel.org, David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH v3] mm: make expand_downwards symmetrical to
 expand_upwards

On Tue, 2011-04-19 at 13:35 -0500, Christoph Lameter wrote:
> On Tue, 19 Apr 2011, James Bottomley wrote:
> 
> > > }
> > >
> > > How in the world did you get a zone setup in node 1 with a !NUMA config?
> >
> > I told you ... I forced an allocation into the first discontiguous
> > region.  That will return 1 for page_to_nid().
> 
> How? The kernel has no concept of a node 1 without CONFIG_NUMA and so you
> cannot tell the page allocator to allocate from node 1.

Yes, it does, as I explained in the email.

> zone_to_nid is used as a fallback mechanism for page_to_nid() and as shown
> will always return 0 for !NUMA configs.
> 
> page_to_nid(x) == zone_to_nid(page_zone(x)) must hold true. It is not
> here.
> 
> > > The problem seems to be that the kernel seems to allow a
> > > definition of a page_to_nid() function that returns non zero in the !NUMA
> > > case.
> >
> > This is called reality, yes.
> 
> There you have the bug. Fix that and things will work fine.

Why don't you file the bug against reality? I'm not sure I have enough
credibility ...

> > right, that's what I told you: slub is broken because it's making a
> > wrong assumption.  Look in asm-generic/memory_model.h it shows how the
> > page_to_nid() is used in finding the pfn array.  DISCONTIGMEM uses some
> > of the numa properties (including assigning zones to the discontiguous
> > regions).
> 
> Bitrotted code?

Don't be silly: alpha, ia64, m32r, m68k, mips, parisc, tile and even x86
all use the discontigmem memory model in some configurations.

>  If it uses numa properties then it must use a zone field
> in struct zone. So DISCONTIGMEM seems to require CONFIG_NUMA.

No ... you're giving me back your assumptions.  They're not based on
what the kernel does.  CONFIG_NUMA may or may not be defined with
CONFIG_DISCONTIGMEM.

Of all the above, only x86 always had NUMA with DISCONTIGMEM.

> > > If you think that is broken then we have brokenness all over the kernel
> > > whenever we determine the node from a page and use that to do a lookup.
> >
> > Not really.  The rest of the kernel uses the proper macros.  In
> > DISCONTIGMEM but !NUMA configs, the numa macros expand correctly.
> > You've cut across that with all the CONFIG_NUMA checks in slub.
> 
> What are "the proper macros"? AFAICT page_to_nid() is the proper way to
> access the node of a page. If page_to_nid() returns 1 then you have a zone
> that the kernel knows of as being in node 0 having a page on a different
> node.

Well it depends what you want.  If you only want the actual NUMA node,
then pfn_to_nid() probably isn't what you want, because in a
DISCONTIGMEM model, there may be multiple nids per actual numa node.

> We can likely force page_to_nid to ignore the node information that have
> been erroneously placed there but this looks like something deeper is
> wrong here. The node field in struct page is not only used for the Linux
> support of a NUMA node but also for blocks of memory. Those should be
> separate things.

Look, it's not wrong, it's by design.  The assumption that non-numa
systems don't use nodes is the wrong one.

> ---
>  include/linux/mm.h |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> Index: linux-2.6/include/linux/mm.h
> ===================================================================
> --- linux-2.6.orig/include/linux/mm.h	2011-04-19 13:20:20.092521248 -0500
> +++ linux-2.6/include/linux/mm.h	2011-04-19 13:21:05.962521196 -0500
> @@ -665,6 +665,7 @@ static inline int zone_to_nid(struct zon
>  #endif
>  }
> 
> +#ifdef CONFIG_NUMA
>  #ifdef NODE_NOT_IN_PAGE_FLAGS
>  extern int page_to_nid(struct page *page);
>  #else
> @@ -673,6 +674,9 @@ static inline int page_to_nid(struct pag
>  	return (page->flags >> NODES_PGSHIFT) & NODES_MASK;
>  }
>  #endif
> +#else
> +#define page_to_nid(x) 0
> +#endif

Don't be silly ... that breaks asm-generic/memory_model.h
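
To make the breakage concrete, here is a minimal user-space sketch of how a
DISCONTIGMEM-style page-to-pfn lookup depends on the nid. It is only a model
under stated assumptions, not the kernel's code: struct region, regions[],
backing[], page_to_pfn_with() and the two page_to_nid_*() helpers are
hypothetical names invented for illustration. A region plays the role of the
per-node data, and its mem_map and start_pfn fields stand in for the per-node
page array and first pfn.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only: none of these definitions are the kernel's. */
#define NR_REGIONS       2
#define PAGES_PER_REGION 4

/* Stand-in for struct page; nid models the node bits kept in page flags. */
struct page {
	int nid;
};

/* Stand-in for the per-node data of one discontiguous memory region. */
struct region {
	struct page *mem_map;     /* first struct page of this region */
	unsigned long start_pfn;  /* pfn of the first page in this region */
};

/* One backing array, so pointer arithmetic across regions stays defined. */
static struct page backing[NR_REGIONS * PAGES_PER_REGION] = {
	{0}, {0}, {0}, {0},   /* region 0 */
	{1}, {1}, {1}, {1},   /* region 1 */
};

/* Two discontiguous pfn ranges: 0..3 and 1024..1027. */
static struct region regions[NR_REGIONS] = {
	{ backing,                    0 },
	{ backing + PAGES_PER_REGION, 1024 },
};

/* Returns the region index actually stored in the page. */
static int page_to_nid_real(const struct page *pg)
{
	return pg->nid;
}

/* Models "#define page_to_nid(x) 0" for a !NUMA build. */
static int page_to_nid_zero(const struct page *pg)
{
	(void)pg;
	return 0;
}

/* page -> pfn: pick the region by nid, then offset into that region. */
static unsigned long page_to_pfn_with(const struct page *pg,
				      int (*to_nid)(const struct page *))
{
	int nid = to_nid(pg);
	const struct region *r;

	assert(nid >= 0 && nid < NR_REGIONS);
	r = &regions[nid];
	return (unsigned long)(pg - r->mem_map) + r->start_pfn;
}
```

In this model, page_to_pfn_with(&backing[5], page_to_nid_real) yields pfn
1025, while the same call with page_to_nid_zero yields 5, which is a pfn
inside region 0 rather than the page's own region.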

James

