linux-kernel - Re: [PATCH 2/2] mm: numa: Do not clear PTEs or PMDs for NUMA hinting faults

Message-ID: <20150306035916.GD4251@dastard>
Date:	Fri, 6 Mar 2015 14:59:16 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Mel Gorman <mgorman@...e.de>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Aneesh Kumar <aneesh.kumar@...ux.vnet.ibm.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Linux-MM <linux-mm@...ck.org>, xfs@....sgi.com,
	linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH 2/2] mm: numa: Do not clear PTEs or PMDs for NUMA hinting
 faults

On Thu, Mar 05, 2015 at 11:54:52PM +0000, Mel Gorman wrote:
> Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226
> 
>    Across the board the 4.0-rc1 numbers are much slower, and the
>    degradation is far worse when using the large memory footprint
>    configs. Perf points straight at the cause - this is from 4.0-rc1
>    on the "-o bhash=101073" config:
> 
>    -   56.07%    56.07%  [kernel]            [k] default_send_IPI_mask_sequence_phys
>       - default_send_IPI_mask_sequence_phys
>          - 99.99% physflat_send_IPI_mask
>             - 99.37% native_send_call_func_ipi
>                  smp_call_function_many
>                - native_flush_tlb_others
>                   - 99.85% flush_tlb_page
>                        ptep_clear_flush
>                        try_to_unmap_one
>                        rmap_walk
>                        try_to_unmap
>                        migrate_pages
>                        migrate_misplaced_page
>                      - handle_mm_fault
>                         - 99.73% __do_page_fault
>                              trace_do_page_fault
>                              do_async_page_fault
>                            + async_page_fault
>               0.63% native_send_call_func_single_ipi
>                  generic_exec_single
>                  smp_call_function_single
> 
> This was bisected to commit 4d9424669946 ("mm: convert p[te|md]_mknonnuma
> and remaining page table manipulations") which clears PTEs and PMDs to make
> them PROT_NONE. This is tidy but tests on some benchmarks indicate that
> there are many more hinting faults trapped resulting in excessive migration.
> This is the result for the old autonuma benchmark for example.

[snip]

Doesn't fix the problem. Runtime is slightly improved (16m45s vs 17m35s)
but it's still much slower than 3.19 (6m5s).
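
For a rough sense of scale, the three runtimes quoted above can be converted
to seconds and compared. This is only a sketch: the helper and variable names
are illustrative, and it assumes 17m35 is the runtime without this patch.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a runtime given as minutes and seconds to total seconds."""
    return minutes * 60 + seconds


# Runtimes quoted in the message (assumed: 17m35s is the run without the patch).
patched = to_seconds(16, 45)
unpatched = to_seconds(17, 35)
baseline_3_19 = to_seconds(6, 5)

# Fraction of runtime saved by the patch, relative to the unpatched run.
improvement = (unpatched - patched) / unpatched

# How many times slower the patched kernel still is than 3.19.
slowdown = patched / baseline_3_19

print(f"improvement from patch: {improvement:.1%}")
print(f"slowdown vs 3.19:       {slowdown:.2f}x")
```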

Stats and profiles still roughly the same:

	360,228      migrate:mm_migrate_pages     ( +-  4.28% )

-   52.69%    52.69%  [kernel]            [k] default_send_IPI_mask_sequence_phys
     default_send_IPI_mask_sequence_phys
   - physflat_send_IPI_mask
      - 97.28% native_send_call_func_ipi
           smp_call_function_many
           native_flush_tlb_others
           flush_tlb_page
           ptep_clear_flush
           try_to_unmap_one
           rmap_walk
           try_to_unmap
           migrate_pages
           migrate_misplaced_page
         - handle_mm_fault
            - 99.59% __do_page_fault
                 trace_do_page_fault
                 do_async_page_fault
               + async_page_fault
      + 2.72% native_send_call_func_single_ipi

numa_hit 36678767
numa_miss 905234
numa_foreign 905234
numa_interleave 14802
numa_local 36656791
numa_other 927210
numa_pte_updates 92168450
numa_huge_pte_updates 0
numa_hint_faults 87573926
numa_hint_faults_local 29730293
numa_pages_migrated 30195890
pgmigrate_success 30195890
pgmigrate_fail 0
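
A few ratios derived from the counters above make the picture easier to read.
This is a minimal sketch, not part of the original report: the dictionary just
restates the numbers quoted here, and the ratio names are illustrative.

```python
# NUMA-balancing counters as quoted in the message (from /proc/vmstat).
counters = {
    "numa_pte_updates": 92168450,
    "numa_hint_faults": 87573926,
    "numa_hint_faults_local": 29730293,
    "numa_pages_migrated": 30195890,
    "pgmigrate_success": 30195890,
    "pgmigrate_fail": 0,
}

faults = counters["numa_hint_faults"]

# Share of hinting faults where the page was already on the local node.
local_ratio = counters["numa_hint_faults_local"] / faults

# Share of hinting faults that ended in a page migration.
migrate_ratio = counters["numa_pages_migrated"] / faults

# Share of PTE updates that were later trapped as hinting faults.
trapped_ratio = faults / counters["numa_pte_updates"]

print(f"local hinting faults:      {local_ratio:.1%}")
print(f"migrations per hint fault: {migrate_ratio:.1%}")
print(f"PTE updates trapped:       {trapped_ratio:.1%}")
```

Under these numbers roughly a third of hinting faults are local and about the
same fraction result in a migration, each of which goes through the
try_to_unmap/TLB flush path shown in the profile.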

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com