Message-ID: <BN9PR11MB5276FCF7D5182D711E135BB78C3BA@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Thu, 28 Aug 2025 07:08:00 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Baolu Lu <baolu.lu@...ux.intel.com>, "Hansen, Dave"
	<dave.hansen@...el.com>, Jason Gunthorpe <jgg@...dia.com>
CC: Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>, "Robin
 Murphy" <robin.murphy@....com>, Jann Horn <jannh@...gle.com>, Vasant Hegde
	<vasant.hegde@....com>, Alistair Popple <apopple@...dia.com>, Peter Zijlstra
	<peterz@...radead.org>, Uladzislau Rezki <urezki@...il.com>, "Jean-Philippe
 Brucker" <jean-philippe@...aro.org>, Andy Lutomirski <luto@...nel.org>, "Lai,
 Yi1" <yi1.lai@...el.com>, "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
	"security@...nel.org" <security@...nel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "stable@...r.kernel.org"
	<stable@...r.kernel.org>, "vishal.moola@...il.com" <vishal.moola@...il.com>,
	Matthew Wilcox <willy@...radead.org>
Subject: RE: [PATCH v3 1/1] iommu/sva: Invalidate KVA range on kernel TLB
 flush

> From: Baolu Lu <baolu.lu@...ux.intel.com>
> Sent: Thursday, August 28, 2025 1:31 PM
>
> @@ -438,7 +438,10 @@ static void cpa_collapse_large_pages(struct
> cpa_data *cpa)
> 
>   	list_for_each_entry_safe(ptdesc, tmp, &pgtables, pt_list) {
>   		list_del(&ptdesc->pt_list);
> -		__free_page(ptdesc_page(ptdesc));
> +		if (IS_ENABLED(CONFIG_ASYNC_PGTABLE_FREE))
> +			kernel_pgtable_async_free(ptdesc);
> +		else
> +			__free_page(ptdesc_page(ptdesc));
>   	}

Dave's suggestion is to check the new ptdesc flag in pagetable_free() and
defer the free there.

Both this hunk and the one above could be converted to:

	ptdesc->__page_type |= PTDESC_TYPE_KERNEL;
	pagetable_free(ptdesc);

> @@ -757,7 +757,14 @@ int pud_free_pmd_page(pud_t *pud, unsigned long
> addr)
> 
>   	free_page((unsigned long)pmd_sv);
> 
> -	pmd_free(&init_mm, pmd);
> +	if (IS_ENABLED(CONFIG_ASYNC_PGTABLE_FREE)) {
> +		struct ptdesc *ptdesc = virt_to_ptdesc(pmd);
> +
> +		ptdesc->__page_type |= PTDESC_TYPE_KERNEL;
> +		kernel_pgtable_async_free(ptdesc);
> +	} else {
> +		pmd_free(&init_mm, pmd);
> +	}

We could add a new pmd_free_kernel() helper that does:

	ptdesc->__page_type |= PTDESC_TYPE_KERNEL;
	pagetable_dtor_free(ptdesc);

>   static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
>   {
> -	pagetable_dtor_free(virt_to_ptdesc(pte));
> +	struct ptdesc *ptdesc = virt_to_ptdesc(pte);
> +
> +	ptdesc->__page_type |= PTDESC_TYPE_KERNEL;
> +
> +	if (IS_ENABLED(CONFIG_ASYNC_PGTABLE_FREE))
> +		kernel_pgtable_async_free(ptdesc);
> +	else
> +		pagetable_dtor_free(ptdesc);
>   }

Same here:

	ptdesc->__page_type |= PTDESC_TYPE_KERNEL;
	pagetable_dtor_free(ptdesc);

Then pagetable_free() handles the deferral in one place (revised from
Dave's draft):

static inline void pagetable_free(struct ptdesc *pt)
{
	struct page *page = ptdesc_page(pt);

	if (IS_ENABLED(CONFIG_ASYNC_PGTABLE_FREE) &&
	    (pt->__page_type & PTDESC_TYPE_KERNEL))
		kernel_pgtable_async_free_page(page);
	else
		__free_pages(page, compound_order(page));
}
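
To make the single dispatch point concrete, here is a minimal user-space sketch of the same decision. It is not kernel code: struct ptdesc_model, async_free_enabled, the counters and the flag value are all hypothetical stand-ins, and the "free" paths only record what would have happened. The flag test is a bitwise AND; an OR with a nonzero constant would be true for every page table and defer all frees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the flag from the patch; the value is illustrative. */
#define PTDESC_TYPE_KERNEL (1u << 0)

/* Minimal model of struct ptdesc: only the fields this sketch needs. */
struct ptdesc_model {
	unsigned int page_type;		/* models ptdesc->__page_type */
	struct ptdesc_model *next;	/* stands in for pt_list */
};

/* Models IS_ENABLED(CONFIG_ASYNC_PGTABLE_FREE); assumed on for the sketch. */
static bool async_free_enabled = true;

static struct ptdesc_model *deferred_list;	/* models the async free list */
static int nr_deferred;				/* frees handed to the worker */
static int nr_freed_now;			/* frees done immediately */

/* One place decides between the deferred and the immediate path. */
static void pagetable_free_model(struct ptdesc_model *pt)
{
	if (async_free_enabled && (pt->page_type & PTDESC_TYPE_KERNEL)) {
		/* Kernel page table: queue it for the worker. */
		pt->next = deferred_list;
		deferred_list = pt;
		nr_deferred++;
	} else {
		/* Everything else: stands in for __free_pages(). */
		nr_freed_now++;
	}
}
```

With this shape a caller only sets the flag and calls the common free helper; it does not need its own IS_ENABLED() branch.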

> +static void kernel_pgtable_work_func(struct work_struct *work)
> +{
> +	struct ptdesc *ptdesc, *next;
> +	LIST_HEAD(page_list);
> +
> +	spin_lock(&kernel_pgtable_work.lock);
> +	list_splice_tail_init(&kernel_pgtable_work.list, &page_list);
> +	spin_unlock(&kernel_pgtable_work.lock);
> +
> +	list_for_each_entry_safe(ptdesc, next, &page_list, pt_list) {
> +		list_del(&ptdesc->pt_list);
> +		if (ptdesc->__page_type & PTDESC_TYPE_KERNEL) {
> +			pagetable_dtor_free(ptdesc);
> +		} else {
> +			struct page *page = ptdesc_page(ptdesc);
> +
> +			__free_pages(page, compound_order(page));
> +		}

Then you only need __free_pages() here.
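
As a rough illustration of why the worker gets simpler, here is a user-space sketch of the detach-and-drain pattern with no type check in the loop. It assumes, as I read the suggestion, that any destructor work already ran in the caller before the entry was queued. All names (ptdesc_model, pending, queue_free, drain_pending) are hypothetical, the list is a plain singly linked list rather than list_head, and the spinlock is only marked in comments.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal model of a queued page table descriptor. */
struct ptdesc_model {
	struct ptdesc_model *next;	/* stands in for pt_list */
	int freed;			/* set where __free_pages() would run */
};

/* Models kernel_pgtable_work.list. */
static struct ptdesc_model *pending;

/* Queue an entry for deferred freeing (the lock would be held here). */
static void queue_free(struct ptdesc_model *pt)
{
	pt->next = pending;
	pending = pt;
}

/* Detach the whole pending list, then free each entry uniformly. */
static int drain_pending(void)
{
	struct ptdesc_model *batch;
	int n = 0;

	/* spin_lock() would be taken here */
	batch = pending;
	pending = NULL;
	/* spin_unlock() would be dropped here */

	while (batch) {
		/* Save the link first: the entry is gone after the free. */
		struct ptdesc_model *next = batch->next;

		batch->next = NULL;
		batch->freed = 1;	/* stands in for __free_pages() */
		batch = next;
		n++;
	}
	return n;
}
```

Detaching the list before the loop keeps the time under the lock short, and every entry takes the same path because the kernel/non-kernel decision was made when it was queued.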
