Message-ID: <f9ccf33b-c81c-6b25-6471-80c600f06732@bytedance.com>
Date:   Sat, 21 May 2022 16:59:19 +0800
From:   Qi Zheng <zhengqi.arch@...edance.com>
To:     Chih-En Lin <shiyn.lin@...il.com>,
        David Hildenbrand <david@...hat.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Christian Brauner <brauner@...nel.org>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        William Kucharski <william.kucharski@...cle.com>,
        John Hubbard <jhubbard@...dia.com>,
        Yunsheng Lin <linyunsheng@...wei.com>,
        Arnd Bergmann <arnd@...db.de>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Colin Cross <ccross@...gle.com>,
        Feng Tang <feng.tang@...el.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Mike Rapoport <rppt@...nel.org>,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Anshuman Khandual <anshuman.khandual@....com>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
        Daniel Axtens <dja@...ens.net>,
        Jonathan Marek <jonathan@...ek.ca>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        Peter Xu <peterx@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Andy Lutomirski <luto@...nel.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Fenghua Yu <fenghua.yu@...el.com>,
        linux-kernel@...r.kernel.org, Kaiyang Zhao <zhao776@...due.edu>,
        Huichun Feng <foxhoundsk.tw@...il.com>,
        Jim Huang <jserv.tw@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org
Subject: Re: [External] [RFC PATCH 0/6] Introduce Copy-On-Write to Page Table



On 2022/5/20 2:31 AM, Chih-En Lin wrote:
> When creating the user process, it usually uses the Copy-On-Write (COW)
> mechanism to save the memory usage and the cost of time for copying.
> COW defers the work of copying private memory and shares it across the
> processes as read-only. If either process wants to write in these
> memories, it will page fault and copy the shared memory, so the process
> will now get its private memory right here, which is called break COW.
> 
> Presently this kind of technology is only used as the mapping memory.
> It still needs to copy the entire page table from the parent.
> It might cost a lot of time and memory to copy each page table when the
> parent already has a lot of page tables allocated. For example, here is
> the state table for mapping the 1 GB memory of forking.
> 
>                 mmap before fork        mmap after fork
> MemTotal:       32746776 kB             32746776 kB
> MemFree:        31468152 kB             31463244 kB
> AnonPages:       1073836 kB              1073628 kB
> Mapped:            39520 kB                39992 kB
> PageTables:         3356 kB                 5432 kB
> 
> This patch introduces Copy-On-Write to the page table. This patch only
> implements the COW on the PTE level. It's based on the paper
> On-Demand Fork [1]. Summary of the implementation for the paper:
> 
> - Only implements the COW to the anonymous mapping
> - Only do COW to the PTE table which the range is all covered by a
>    single VMA.
> - Use the reference count to control the COW PTE table lifetime.
>    Decrease the counter when breaking COW or dereference the COW PTE
>    table. When the counter reduces to zero, free the PTE table.
> 

Hi,

To reduce the number of empty user PTE tables, I also introduced a
reference count (pte_ref) for user PTE tables in my patches [1][2]. It
is used to track the usage of each user PTE table.

The following hold a pte_ref:
  - Each !pte_none() entry, such as a regular page table entry that maps
    a physical page, a swap entry, a migration entry, etc.
  - Each visitor to the PTE table entries, such as a page table walker.

With COW PTE, a new holder (the process using the COW PTE) is added.
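
The holder rules above can be modeled as a small userspace sketch. This
is not code from either patch series: the struct, the helper names, and
the choice to start an empty table at a count of zero are all
assumptions made for illustration. It only shows that the table is
released when the last holder of any kind drops its reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PTRS_PER_TABLE 512 /* entries per PTE table with 4 KiB pages on x86-64 */

/* Hypothetical model of one user PTE table; not a kernel structure. */
struct pte_table_model {
	unsigned long entries[PTRS_PER_TABLE]; /* 0 models pte_none() */
	int pte_ref;                           /* number of current holders */
	bool freed;                            /* set when the last holder drops */
};

static void table_init(struct pte_table_model *t)
{
	for (size_t i = 0; i < PTRS_PER_TABLE; i++)
		t->entries[i] = 0;
	t->pte_ref = 0; /* assumption: an empty, unvisited table has no holders */
	t->freed = false;
}

static void table_get(struct pte_table_model *t)
{
	assert(!t->freed);
	t->pte_ref++;
}

/* Drop one reference; returns true if the table is now freed. */
static bool table_put(struct pte_table_model *t)
{
	assert(t->pte_ref > 0);
	if (--t->pte_ref == 0)
		t->freed = true;
	return t->freed;
}

/* Holder 1: every !pte_none() entry pins the table. */
static void set_entry(struct pte_table_model *t, size_t idx, unsigned long val)
{
	assert(idx < PTRS_PER_TABLE && val != 0);
	if (t->entries[idx] == 0)
		table_get(t); /* none -> present takes a reference */
	t->entries[idx] = val;
}

static bool clear_entry(struct pte_table_model *t, size_t idx)
{
	assert(idx < PTRS_PER_TABLE);
	if (t->entries[idx] == 0)
		return t->freed; /* already none, nothing to drop */
	t->entries[idx] = 0;
	return table_put(t); /* present -> none drops the reference */
}

/* Holder 2: a visitor such as a page table walker. */
static void walker_begin(struct pte_table_model *t) { table_get(t); }
static bool walker_end(struct pte_table_model *t) { return table_put(t); }

/* Holder 3 (new with COW PTE): a process sharing the table. */
static void cow_share(struct pte_table_model *t) { table_get(t); }
static bool cow_unshare(struct pte_table_model *t) { return table_put(t); }
```

In this model, a table with two present entries, one walker, and one
COW sharer has a count of 4, and it is freed only after all four
holders have dropped their references, in any order.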

Interestingly, this shows me that pte_ref can carry more meaning than I
first intended.

Thanks,
Qi

[1] [RFC PATCH 00/18] Try to free user PTE page table pages
     (percpu_ref version)
     link: https://lore.kernel.org/lkml/20220429133552.33768-1-zhengqi.arch@bytedance.com/

[2] [PATCH v3 00/15] Free user PTE page table pages
     (atomic count version)
     link: https://lore.kernel.org/lkml/20211110105428.32458-1-zhengqi.arch@bytedance.com/

-- 
Thanks,
Qi
