lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240814035451.773331-1-yuzhao@google.com>
Date: Tue, 13 Aug 2024 21:54:48 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Andrew Morton <akpm@...ux-foundation.org>, Muchun Song <muchun.song@...ux.dev>
Cc: "Matthew Wilcox (Oracle)" <willy@...radead.org>, Zi Yan <ziy@...dia.com>, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org, Yu Zhao <yuzhao@...gle.com>
Subject: [PATCH mm-unstable v2 0/3] mm/hugetlb: alloc/free gigantic folios

Use __GFP_COMP for gigantic folios can greatly reduce not only the
amount of code but also the allocation and free time.

Approximate LOC to mm/hugetlb.c: +60, -240

Allocate and free 500 1GB hugeTLB memory without HVO by:
  time echo 500 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

       Before  After
Alloc  ~13s    ~10s
Free   ~15s    <1s

The above magnitude generally holds for multiple x86 and arm64 CPU
models.

Perf profile before:
  Alloc
    - 99.99% alloc_pool_huge_folio
       - __alloc_fresh_hugetlb_folio
          - 83.23% alloc_contig_pages_noprof
             - 47.46% alloc_contig_range_noprof
                - 20.96% isolate_freepages_range
                     16.10% split_page
                - 14.10% start_isolate_page_range
                - 12.02% undo_isolate_page_range

  Free
    - update_and_free_pages_bulk
       - 87.71% free_contig_range
          - 76.02% free_unref_page
             - 41.30% free_unref_page_commit
                - 32.58% free_pcppages_bulk
                   - 24.75% __free_one_page
               13.96% _raw_spin_trylock
         12.27% __update_and_free_hugetlb_folio

Perf profile after:
  Alloc
    - 99.99% alloc_pool_huge_folio
         alloc_gigantic_folio
       - alloc_contig_pages_noprof
          - 59.15% alloc_contig_range_noprof
             - 20.72% start_isolate_page_range
               20.64% prep_new_page
             - 17.13% undo_isolate_page_range

  Free
    - update_and_free_pages_bulk
       - __folio_put
       - __free_pages_ok
            7.46% free_tail_page_prepare
          - 1.97% free_one_page
               1.86% __free_one_page

Yu Zhao (3):
  mm/contig_alloc: support __GFP_COMP
  mm/cma: add cma_{alloc,free}_folio()
  mm/hugetlb: use __GFP_COMP for gigantic folios

 include/linux/cma.h     |  16 +++
 include/linux/gfp.h     |  23 ++++
 include/linux/hugetlb.h |   9 +-
 mm/cma.c                |  55 ++++++--
 mm/compaction.c         |  41 +-----
 mm/hugetlb.c            | 293 ++++++++--------------------------------
 mm/page_alloc.c         | 111 ++++++++++-----
 7 files changed, 226 insertions(+), 322 deletions(-)

-- 
2.46.0.76.ge559c4bf1a-goog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ