Message-ID: <CAE9FiQUjVRUs02-ymmtO+5+SgqTWK8Ae6jJwD08uRbgR=eLJgw@mail.gmail.com>
Date:	Sat, 23 Mar 2013 13:37:37 -0700
From:	Yinghai Lu <yinghai@...nel.org>
To:	Russ Anderson <rja@....com>, Tejun Heo <tj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Ingo Molnar <mingo@...nel.org>,
	David Rientjes <rientjes@...gle.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
	hpa@...or.com
Subject: Re: [patch] mm: speedup in __early_pfn_to_nid

On Sat, Mar 23, 2013 at 8:29 AM, Russ Anderson <rja@....com> wrote:
> On Fri, Mar 22, 2013 at 08:25:32AM +0100, Ingo Molnar wrote:
> ------------------------------------------------------------
> When booting on a large memory system, the kernel spends
> considerable time in memmap_init_zone() setting up memory zones.
> Analysis shows significant time spent in __early_pfn_to_nid().
>
> The routine memmap_init_zone() checks each PFN to verify the
> nid is valid.  __early_pfn_to_nid() sequentially scans the list of
> pfn ranges to find the right range and returns the nid.  This does
> not scale well.  On a 4 TB (single rack) system there are 308
> memory ranges to scan.  The higher the PFN the more time spent
> sequentially spinning through memory ranges.
>
> Since memmap_init_zone() increments pfn, it will almost always be
> looking for the same range as the previous pfn, so check that
> range first.  If it is in the same range, return that nid.
> If not, scan the list as before.
>
> A 4 TB (single rack) UV1 system takes 512 seconds to get through
> the zone code.  This performance optimization reduces the time
> by 189 seconds, a 36% improvement.
>
> A 2 TB (single rack) UV2 system goes from 212.7 seconds to 99.8 seconds,
> a 112.9 second (53%) reduction.

Interesting, but there are only 308 entries in memblock...

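For scale (a back-of-the-envelope estimate, not a measurement from the
thread): 4 TB with 4 KB pages is roughly 2^30 PFNs, and a linear scan of
308 ranges averages ~154 comparisons per lookup, so memmap_init_zone()
ends up doing on the order of 10^11 range checks during boot. Caching
the last matching range collapses that to about one check per PFN for a
mostly-sequential walk.
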
Did you try extending memblock_search() to look up the nid as well?
Something like the attached patch. That should save more time.
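
The attachment itself is not reproduced here, so the following is only
a minimal sketch of that idea (my reconstruction, assuming the usual
bisection in memblock_search() over memblock.memory plus
memblock_get_region_node() for the nid; the attached patch may differ):

/* sketch only; assumes <linux/memblock.h> and <linux/pfn.h> */
static int __init_memblock memblock_search(struct memblock_type *type,
					   phys_addr_t addr)
{
	unsigned long lo = 0, hi = type->cnt;

	/* bisect the sorted, non-overlapping region array */
	while (lo < hi) {
		unsigned long mid = (lo + hi) / 2;
		struct memblock_region *r = &type->regions[mid];

		if (addr < r->base)
			hi = mid;
		else if (addr >= r->base + r->size)
			lo = mid + 1;
		else
			return mid;	/* addr is inside region mid */
	}
	return -1;	/* not covered by any region */
}

int __meminit __early_pfn_to_nid(unsigned long pfn)
{
	struct memblock_type *type = &memblock.memory;
	int i = memblock_search(type, PFN_PHYS(pfn));

	if (i < 0)
		return -1;	/* memory hole */
	return memblock_get_region_node(&type->regions[i]);
}

That makes every lookup O(log n) over the 308 regions with no cached
state, instead of O(n) in the worst case.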

>
> Signed-off-by: Russ Anderson <rja@....com>
> ---
>  arch/ia64/mm/numa.c |   15 ++++++++++++++-
>  mm/page_alloc.c     |   15 ++++++++++++++-
>  2 files changed, 28 insertions(+), 2 deletions(-)
>
> Index: linux/mm/page_alloc.c
> ===================================================================
> --- linux.orig/mm/page_alloc.c  2013-03-19 16:09:03.736450861 -0500
> +++ linux/mm/page_alloc.c       2013-03-22 17:07:43.895405617 -0500
> @@ -4161,10 +4161,23 @@ int __meminit __early_pfn_to_nid(unsigne
>  {
>         unsigned long start_pfn, end_pfn;
>         int i, nid;
> +       /*
> +          NOTE: The following SMP-unsafe globals are only used early
> +          in boot when the kernel is running single-threaded.
> +        */
> +       static unsigned long last_start_pfn, last_end_pfn;
> +       static int last_nid;
> +
> +       if (last_start_pfn <= pfn && pfn < last_end_pfn)
> +               return last_nid;
>
>         for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid)
> -               if (start_pfn <= pfn && pfn < end_pfn)
> +               if (start_pfn <= pfn && pfn < end_pfn) {
> +                       last_start_pfn = start_pfn;
> +                       last_end_pfn = end_pfn;
> +                       last_nid = nid;
>                         return nid;
> +               }
>         /* This is a memory hole */
>         return -1;
>  }
> Index: linux/arch/ia64/mm/numa.c
> ===================================================================
> --- linux.orig/arch/ia64/mm/numa.c      2013-02-25 15:49:44.000000000 -0600
> +++ linux/arch/ia64/mm/numa.c   2013-03-22 16:09:44.662268239 -0500
> @@ -61,13 +61,26 @@ paddr_to_nid(unsigned long paddr)
>  int __meminit __early_pfn_to_nid(unsigned long pfn)
>  {
>         int i, section = pfn >> PFN_SECTION_SHIFT, ssec, esec;
> +       /*
> +          NOTE: The following SMP-unsafe globals are only used early
> +          in boot when the kernel is running single-threaded.
> +       */
> +       static unsigned long last_start_pfn, last_end_pfn;

last_ssec, last_esec? (the declaration above says last_start_pfn /
last_end_pfn, but the code below tests and sets last_ssec / last_esec)


> +       static int last_nid;
> +
> +       if (section >= last_ssec && section < last_esec)
> +               return last_nid;
>
>         for (i = 0; i < num_node_memblks; i++) {
>                 ssec = node_memblk[i].start_paddr >> PA_SECTION_SHIFT;
>                 esec = (node_memblk[i].start_paddr + node_memblk[i].size +
>                         ((1L << PA_SECTION_SHIFT) - 1)) >> PA_SECTION_SHIFT;
> -               if (section >= ssec && section < esec)
> +               if (section >= ssec && section < esec) {
> +                       last_ssec = ssec;
> +                       last_esec = esec;
> +                       last_nid = node_memblk[i].nid;
>                         return node_memblk[i].nid;
> +               }
>         }
>
>         return -1;
>

Also, it looks like you forgot to put the ia64 maintainers on the To list.

Maybe just put the ia64 part in a separate patch?

Thanks

Yinghai

[Attachment: "memblock_search_pfn_nid.patch" (application/octet-stream, 2370 bytes)]
