Date:   Wed, 31 May 2023 16:48:52 -0300
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Hugh Dickins <hughd@...gle.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...nel.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        David Hildenbrand <david@...hat.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Qi Zheng <zhengqi.arch@...edance.com>,
        Yang Shi <shy828301@...il.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Peter Xu <peterx@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
        Alistair Popple <apopple@...dia.com>,
        Ralph Campbell <rcampbell@...dia.com>,
        Ira Weiny <ira.weiny@...el.com>,
        Steven Price <steven.price@....com>,
        SeongJae Park <sj@...nel.org>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        Zack Rusin <zackr@...are.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Minchan Kim <minchan@...nel.org>,
        Christoph Hellwig <hch@...radead.org>,
        Song Liu <song@...nel.org>,
        Thomas Hellstrom <thomas.hellstrom@...ux.intel.com>,
        Russell King <linux@...linux.org.uk>,
        "David S. Miller" <davem@...emloft.net>,
        Michael Ellerman <mpe@...erman.id.au>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ux.ibm.com>,
        Claudio Imbrenda <imbrenda@...ux.ibm.com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Jann Horn <jannh@...gle.com>,
        linux-arm-kernel@...ts.infradead.org, sparclinux@...r.kernel.org,
        linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 08/12] mm/pgtable: add pte_free_defer() for pgtable as
 page

On Sun, May 28, 2023 at 11:23:47PM -0700, Hugh Dickins wrote:
> Add the generic pte_free_defer(), to call pte_free() via call_rcu().
> pte_free_defer() will be called inside khugepaged's retract_page_tables()
> loop, where allocating extra memory cannot be relied upon.  This version
> suits all those architectures which use an unfragmented page for one page
> table (none of whose pte_free()s use the mm arg which was passed to it).
> 
> Signed-off-by: Hugh Dickins <hughd@...gle.com>
> ---
>  include/linux/pgtable.h |  2 ++
>  mm/pgtable-generic.c    | 20 ++++++++++++++++++++
>  2 files changed, 22 insertions(+)
> 
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 8b0fc7fdc46f..62a8732d92f0 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -112,6 +112,8 @@ static inline void pte_unmap(pte_t *pte)
>  }
>  #endif
>  
> +void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable);
> +
>  /* Find an entry in the second-level page table.. */
>  #ifndef pmd_offset
>  static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
> diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
> index d28b63386cef..471697dcb244 100644
> --- a/mm/pgtable-generic.c
> +++ b/mm/pgtable-generic.c
> @@ -13,6 +13,7 @@
>  #include <linux/swap.h>
>  #include <linux/swapops.h>
>  #include <linux/mm_inline.h>
> +#include <asm/pgalloc.h>
>  #include <asm/tlb.h>
>  
>  /*
> @@ -230,6 +231,25 @@ pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long address,
>  	return pmd;
>  }
>  #endif
> +
> +/* arch define pte_free_defer in asm/pgalloc.h for its own implementation */
> +#ifndef pte_free_defer
> +static void pte_free_now(struct rcu_head *head)
> +{
> +	struct page *page;
> +
> +	page = container_of(head, struct page, rcu_head);
> +	pte_free(NULL /* mm not passed and not used */, (pgtable_t)page);
> +}
> +
> +void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable)
> +{
> +	struct page *page;
> +
> +	page = pgtable;
> +	call_rcu(&page->rcu_head, pte_free_now);

People have told me that we can't use the rcu_head on the struct page
backing page table blocks. I understood it was because PPC was using
that memory for something else.

I was hoping Matthew's folio conversion would help clarify this.

On the flip side, if we are able to use rcu_head here then we should
use it everywhere, and also use it in mmu_gather.c instead of
allocating memory and having the smp_call_function() fallback. This
would fix it to be actual RCU.
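
To make the trade-off concrete, here is a minimal userspace sketch, not
kernel code, of why an embedded callback head removes the need to
allocate at free time: the queue is threaded through the objects being
freed. Every name in it (cb_head, fake_table, defer_call, run_pending)
is hypothetical and only stands in for rcu_head, a page table page,
call_rcu() and the end of a grace period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the kernel's struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for a page table page; the callback head is embedded in it. */
struct fake_table {
	int id;
	struct cb_head head;
};

static struct cb_head *pending;	/* callbacks waiting for the "grace period" */
static int freed_count;

/* Queue a callback. No allocation: the list node is the embedded head. */
static void defer_call(struct cb_head *head, void (*func)(struct cb_head *))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Callback: recover the containing object from its head and free it. */
static void fake_table_free_now(struct cb_head *head)
{
	struct fake_table *t = container_of(head, struct fake_table, head);

	free(t);
	freed_count++;
}

/* Deferred free: cannot fail, since it needs no extra memory. */
static void fake_table_free_defer(struct fake_table *t)
{
	assert(t != NULL);
	defer_call(&t->head, fake_table_free_now);
}

/* Pretend a grace period has elapsed: run and count every queued callback. */
static int run_pending(void)
{
	int n = 0;

	while (pending) {
		struct cb_head *head = pending;

		pending = head->next;	/* read next before the callback frees head */
		head->func(head);
		n++;
	}
	return n;
}
```

Calling fake_table_free_defer() on a few malloc()ed tables and then
run_pending() frees them all without the defer path ever allocating,
which is the property that would make a separate batch allocation and
its fallback unnecessary, assuming the space for the head really is
available in the object.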

There have been a few talks that it sure would be nice if the page
tables were always freed via RCU and every arch just turns on
CONFIG_MMU_GATHER_RCU_TABLE_FREE. It seems to me that patch 10 is kind
of half doing that by making this one path always use RCU on all
arches.

AFAIK the main reason it hasn't been done is the lack of an rcu_head.

Jason
