Message-ID: <Ym03Z7FlgcCpwXCi@debian.me>
Date:   Sat, 30 Apr 2022 20:19:35 +0700
From:   Bagas Sanjaya <bagasdotme@...il.com>
To:     Qi Zheng <zhengqi.arch@...edance.com>
Cc:     akpm@...ux-foundation.org, tglx@...utronix.de,
        kirill.shutemov@...ux.intel.com, mika.penttila@...tfour.com,
        david@...hat.com, jgg@...dia.com, tj@...nel.org, dennis@...nel.org,
        ming.lei@...hat.com, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        songmuchun@...edance.com, zhouchengming@...edance.com
Subject: Re: [RFC PATCH 18/18] Documentation: add document for pte_ref

Hi Qi,

On Fri, Apr 29, 2022 at 09:35:52PM +0800, Qi Zheng wrote:
> +Now in order to pursue high performance, applications mostly use some
> +high-performance user-mode memory allocators, such as jemalloc or tcmalloc.
> +These memory allocators use madvise(MADV_DONTNEED or MADV_FREE) to release
> +physical memory for the following reasons::
> +
> + First of all, we should hold as few write locks of mmap_lock as possible,
> + since the mmap_lock semaphore has long been a contention point in the
> + memory management subsystem. The mmap()/munmap() hold the write lock, and
> + the madvise(MADV_DONTNEED or MADV_FREE) hold the read lock, so using
> + madvise() instead of munmap() to released physical memory can reduce the
> + competition of the mmap_lock.
> +
> + Secondly, after using madvise() to release physical memory, there is no
> + need to build vma and allocate page tables again when accessing the same
> + virtual address again, which can also save some time.
> +

I think we can use an enumerated list here instead of a literal block, like below:

-- >8 --

diff --git a/Documentation/vm/pte_ref.rst b/Documentation/vm/pte_ref.rst
index 0ac1e5a408d7c6..67b18e74fcb367 100644
--- a/Documentation/vm/pte_ref.rst
+++ b/Documentation/vm/pte_ref.rst
@@ -10,18 +10,18 @@ Preface
 Now in order to pursue high performance, applications mostly use some
 high-performance user-mode memory allocators, such as jemalloc or tcmalloc.
 These memory allocators use madvise(MADV_DONTNEED or MADV_FREE) to release
-physical memory for the following reasons::
-
- First of all, we should hold as few write locks of mmap_lock as possible,
- since the mmap_lock semaphore has long been a contention point in the
- memory management subsystem. The mmap()/munmap() hold the write lock, and
- the madvise(MADV_DONTNEED or MADV_FREE) hold the read lock, so using
- madvise() instead of munmap() to released physical memory can reduce the
- competition of the mmap_lock.
-
- Secondly, after using madvise() to release physical memory, there is no
- need to build vma and allocate page tables again when accessing the same
- virtual address again, which can also save some time.
+physical memory for the following reasons:
+
+1. We should hold as few write locks of mmap_lock as possible,
+   since the mmap_lock semaphore has long been a contention point in the
+   memory management subsystem. mmap()/munmap() hold the write lock, and
+   madvise(MADV_DONTNEED or MADV_FREE) holds the read lock, so using
+   madvise() instead of munmap() to release physical memory can reduce
+   contention on mmap_lock.
+
+2. After using madvise() to release physical memory, there is no
+   need to build vma and allocate page tables again when accessing the same
+   virtual address again, which can also save some time.
 
 The following is the largest user PTE page table memory that can be
 allocated by a single user process in a 32-bit and a 64-bit system.
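
For readers less familiar with the allocator behaviour the preface describes,
here is a minimal userspace sketch of it. It is only an illustration under
assumptions: it relies on Linux semantics for private anonymous mappings, and
the helper name release_and_reuse() is made up for this example and is not
part of the patch.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map an anonymous region, dirty it, drop its physical
 * pages with madvise(MADV_DONTNEED), then touch the same addresses again.
 * Returns 0 if the region is still mapped and reads back as zero,
 * -1 on any failure. Linux behaviour is assumed.
 */
int release_and_reuse(size_t len)
{
	unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0xaa, len);		/* fault in physical pages */

	/* Release the physical pages; the mapping itself stays in place. */
	if (madvise(buf, len, MADV_DONTNEED) != 0) {
		munmap(buf, len);
		return -1;
	}

	/* No new mmap() is needed: the access faults in fresh zero pages. */
	int ok = (buf[0] == 0 && buf[len - 1] == 0);

	munmap(buf, len);
	return ok ? 0 : -1;
}
```

Calling release_and_reuse(1 << 20) should return 0 on Linux. The only point is
that the virtual range remains usable after madvise(), which is the second
reason given in the list above.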

> +The following is the largest user PTE page table memory that can be
> +allocated by a single user process in a 32-bit and a 64-bit system.
> +

We can say "assuming 4K page size" here,

> ++---------------------------+--------+---------+
> +|                           | 32-bit | 64-bit  |
> ++===========================+========+=========+
> +| user PTE page table pages | 3 MiB  | 512 GiB |
> ++---------------------------+--------+---------+
> +| user PMD page table pages | 3 KiB  | 1 GiB   |
> ++---------------------------+--------+---------+
> +
> +(for 32-bit, take 3G user address space, 4K page size as an example;
> + for 64-bit, take 48-bit address width, 4K page size as an example.)
> +

... instead of here.
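
As a sanity check, the numbers in the table can be reproduced with the
arithmetic below. This is only a sketch: the function names are made up, and
it assumes 4-byte entries with 1024 entries per table on 32-bit and 8-byte
entries with 512 entries per table on 64-bit, which the patch does not state
explicitly.

```c
#include <stdint.h>

/*
 * Bytes of page-table entries needed to map `span` bytes of address space
 * when each entry covers `covered` bytes and occupies `entry_size` bytes.
 */
uint64_t table_bytes(uint64_t span, uint64_t covered, uint64_t entry_size)
{
	return span / covered * entry_size;
}

/* 32-bit: 3 GiB user space, 4 KiB pages, assumed 4-byte entries. */
uint64_t pte_bytes_32(void)
{
	return table_bytes(3ULL << 30, 4ULL << 10, 4);
}

/* One PTE table holds 1024 entries, so each upper-level entry covers 4 MiB. */
uint64_t pmd_bytes_32(void)
{
	return table_bytes(3ULL << 30, 4ULL << 20, 4);
}

/* 64-bit: 48-bit address width, 4 KiB pages, assumed 8-byte entries. */
uint64_t pte_bytes_64(void)
{
	return table_bytes(1ULL << 48, 4ULL << 10, 8);
}

/* One PTE table holds 512 entries, so each PMD entry covers 2 MiB. */
uint64_t pmd_bytes_64(void)
{
	return table_bytes(1ULL << 48, 2ULL << 20, 8);
}
```

Under those assumptions the four functions give 3 MiB, 3 KiB, 512 GiB and
1 GiB, matching the table, so stating the page size up front should be enough
for readers to follow it.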

> +There is also a lock-less scenario(such as fast GUP). Fortunately, we don't need
> +to do any additional operations to ensure that the system is in order. Take fast
> +GUP as an example::
> +
> +	thread A		thread B
> +	fast GUP		madvise(MADV_DONTNEED)
> +	========		======================
> +
> +	get_user_pages_fast_only()
> +	--> local_irq_save();
> +				call_rcu(pte_free_rcu)
> +	    gup_pgd_range();
> +	    local_irq_restore();
> +	    			/* do pte_free_rcu() */
> +

I see a whitespace warning around the "do pte_free_rcu()" line above when
applying this series.

Thanks.

-- 
An old man doll... just what I always wanted! - Clara
