Message-Id: <cover.1602093760.git.yuleixzhang@tencent.com>
Date:   Thu,  8 Oct 2020 15:53:50 +0800
From:   yulei.kernel@...il.com
To:     akpm@...ux-foundation.org, naoya.horiguchi@....com,
        viro@...iv.linux.org.uk, pbonzini@...hat.com
Cc:     linux-fsdevel@...r.kernel.org, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, xiaoguangrong.eric@...il.com,
        kernellwp@...il.com, lihaiwei.kernel@...il.com,
        Yulei Zhang <yuleixzhang@...cent.com>
Subject: [PATCH 00/35] Enhance memory utilization with DMEMFS

From: Yulei Zhang <yuleixzhang@...cent.com>

In the current system, each physical memory page is associated with
a page structure that tracks the usage of that page.
As memory usage grows rapidly in cloud environments, we find that
the memory consumed by storing these page structures has become
significant. Is this an expense we could spare?

This patchset proposes a way to save that extra memory
through a new virtual filesystem -- dmemfs.

Dmemfs (Direct Memory filesystem) is a filesystem backed by device
memory or reserved memory. This kind of memory is special because it
is not managed by the kernel and, most importantly, has no 'struct page'.
We can therefore use the memory saved on the host system
to support more tenants in our cloud service.

We use a kernel boot parameter, 'dmem=', to reserve system
memory when the host boots. The details are described
in Documentation/admin-guide/kernel-parameters.txt.

Theoretically, dropping the 'struct page' saves 64 bytes for each
4K physical page, so for 320G of guest memory it saves about 5G of
physical memory in total.
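
As a rough check of the figure above, the arithmetic can be sketched
as follows. This is only an illustration, not part of the patchset:
the helper name struct_page_overhead() is hypothetical, and the 64-byte
'struct page' size and 4K page size are the values quoted in this
letter, which may differ with kernel configuration.

```python
# Illustrative arithmetic only; not taken from the dmemfs patches.
PAGE_SIZE = 4 * 1024        # 4K base page, as assumed in the cover letter
STRUCT_PAGE_SIZE = 64       # bytes per 'struct page', as quoted in the cover letter
GIB = 1 << 30


def struct_page_overhead(mem_bytes, page_size=PAGE_SIZE,
                         struct_page_size=STRUCT_PAGE_SIZE):
    """Return the bytes of 'struct page' metadata needed to track mem_bytes."""
    return (mem_bytes // page_size) * struct_page_size


guest_mem = 320 * GIB
saved = struct_page_overhead(guest_mem)
print(f"{saved / GIB:.1f} GiB saved for {guest_mem // GIB} GiB of guest memory")
```

Under these assumptions the overhead is 64/4096, about 1.56% of the
memory handed to guests.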

Detailed usage of dmemfs is described in
Documentation/filesystems/dmemfs.rst.

TODO:
1. We temporarily disable record_steal_time() before entering the
guest; it will be re-enabled once the conflict is resolved.
2. Support for system calls such as mincore is in progress; we will
update the status and patches soon.

Yulei Zhang (35):
  fs: introduce dmemfs module
  mm: support direct memory reservation
  dmem: implement dmem memory management
  dmem: let pat recognize dmem
  dmemfs: support mmap
  dmemfs: support truncating inode down
  dmem: trace core functions
  dmem: show some statistic in debugfs
  dmemfs: support remote access
  dmemfs: introduce max_alloc_try_dpages parameter
  mm: export mempolicy interfaces to serve dmem allocator
  dmem: introduce mempolicy support
  mm, dmem: introduce PFN_DMEM and pfn_t_dmem
  mm, dmem: dmem-pmd vs thp-pmd
  mm: add pmd_special() check for pmd_trans_huge_lock()
  dmemfs: introduce ->split() to dmemfs_vm_ops
  mm, dmemfs: support unmap_page_range() for dmemfs pmd
  mm: follow_pmd_mask() for dmem huge pmd
  mm: gup_huge_pmd() for dmem huge pmd
  mm: support dmem huge pmd for vmf_insert_pfn_pmd()
  mm: support dmem huge pmd for follow_pfn()
  kvm, x86: Distinguish dmemfs page from mmio page
  kvm, x86: introduce VM_DMEM
  dmemfs: support hugepage for dmemfs
  mm, x86, dmem: fix estimation of reserved page for vaddr_get_pfn()
  mm, dmem: introduce pud_special()
  mm: add pud_special() to support dmem huge pud
  mm, dmemfs: support huge_fault() for dmemfs
  mm: add follow_pte_pud()
  dmem: introduce dmem_bitmap_alloc() and dmem_bitmap_free()
  dmem: introduce mce handler
  mm, dmemfs: register and handle the dmem mce
  kvm, x86: temporary disable record_steal_time for dmem
  dmem: add dmem unit tests
  Add documentation for dmemfs

 .../admin-guide/kernel-parameters.txt         |   38 +
 Documentation/filesystems/dmemfs.rst          |   59 +
 arch/x86/Kconfig                              |    1 +
 arch/x86/include/asm/pgtable.h                |   32 +-
 arch/x86/include/asm/pgtable_types.h          |   13 +-
 arch/x86/kernel/setup.c                       |    3 +
 arch/x86/kvm/mmu/mmu.c                        |    5 +-
 arch/x86/kvm/x86.c                            |    2 +
 arch/x86/mm/pat/memtype.c                     |   21 +
 drivers/vfio/vfio_iommu_type1.c               |    4 +
 fs/Kconfig                                    |    1 +
 fs/Makefile                                   |    1 +
 fs/dmemfs/Kconfig                             |   16 +
 fs/dmemfs/Makefile                            |    8 +
 fs/dmemfs/inode.c                             | 1063 ++++++++++++++++
 fs/dmemfs/trace.h                             |   54 +
 fs/inode.c                                    |    6 +
 include/linux/dmem.h                          |   49 +
 include/linux/fs.h                            |    1 +
 include/linux/huge_mm.h                       |    5 +-
 include/linux/mempolicy.h                     |    3 +
 include/linux/mm.h                            |    9 +
 include/linux/pfn_t.h                         |   17 +-
 include/linux/pgtable.h                       |   22 +
 include/trace/events/dmem.h                   |   85 ++
 include/uapi/linux/magic.h                    |    1 +
 mm/Kconfig                                    |   21 +
 mm/Makefile                                   |    1 +
 mm/dmem.c                                     | 1075 +++++++++++++++++
 mm/dmem_reserve.c                             |  303 +++++
 mm/gup.c                                      |   94 +-
 mm/huge_memory.c                              |   19 +-
 mm/memory-failure.c                           |   69 +-
 mm/memory.c                                   |   74 +-
 mm/mempolicy.c                                |    4 +-
 mm/mprotect.c                                 |    7 +-
 mm/mremap.c                                   |    3 +
 tools/testing/dmem/Kbuild                     |    1 +
 tools/testing/dmem/Makefile                   |   10 +
 tools/testing/dmem/dmem-test.c                |  184 +++
 40 files changed, 3336 insertions(+), 48 deletions(-)
 create mode 100644 Documentation/filesystems/dmemfs.rst
 create mode 100644 fs/dmemfs/Kconfig
 create mode 100644 fs/dmemfs/Makefile
 create mode 100644 fs/dmemfs/inode.c
 create mode 100644 fs/dmemfs/trace.h
 create mode 100644 include/linux/dmem.h
 create mode 100644 include/trace/events/dmem.h
 create mode 100644 mm/dmem.c
 create mode 100644 mm/dmem_reserve.c
 create mode 100644 tools/testing/dmem/Kbuild
 create mode 100644 tools/testing/dmem/Makefile
 create mode 100644 tools/testing/dmem/dmem-test.c

-- 
2.28.0
