Message-ID: <4c2144a4-8889-483b-bb16-4d361d1d3d90@intel.com>
Date: Fri, 28 Feb 2025 11:00:52 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Rik van Riel <riel@...riel.com>, x86@...nel.org
Cc: linux-kernel@...r.kernel.org, bp@...en8.de, peterz@...radead.org,
dave.hansen@...ux.intel.com, zhengqi.arch@...edance.com,
nadav.amit@...il.com, thomas.lendacky@....com, kernel-team@...a.com,
linux-mm@...ck.org, akpm@...ux-foundation.org, jackmanb@...gle.com,
jannh@...gle.com, mhklinux@...look.com, andrew.cooper3@...rix.com,
Manali.Shukla@....com, mingo@...nel.org
Subject: Re: [PATCH v14 04/13] x86/mm: use INVLPGB for kernel TLB flushes
On 2/25/25 19:00, Rik van Riel wrote:
> Use broadcast TLB invalidation for kernel addresses when available.
>
> Remove the need to send IPIs for kernel TLB flushes.
Nit: the changelog doesn't address the refactoring.
*Ideally*, you'd create the helpers and move the code there in one patch
and then actually "use INVLPGB for kernel TLB flushes" in the next. It's
compact enough here that it's not a deal breaker.
> +static void invlpgb_kernel_range_flush(struct flush_tlb_info *info)
> +{
> + unsigned long addr, nr;
> +
> + for (addr = info->start; addr < info->end; addr += nr << PAGE_SHIFT) {
> + nr = (info->end - addr) >> PAGE_SHIFT;
> + nr = clamp_val(nr, 1, invlpgb_count_max);
> + invlpgb_flush_addr_nosync(addr, nr);
> + }
> + __tlbsync();
> +}
This needs a comment or two: one explaining that the function can take
arbitrarily large sizes:
/*
* Flush an arbitrarily large range of memory with INVLPGB
*/
But noting that the _instruction_ can not is important. This would be
great in the loop, just above the clamp:
/*
* INVLPGB has a limit on the size of ranges
* it can flush. Break large flushes up.
*/
> static void do_kernel_range_flush(void *info)
> {
> struct flush_tlb_info *f = info;
> @@ -1087,6 +1099,22 @@ static void do_kernel_range_flush(void *info)
> flush_tlb_one_kernel(addr);
> }
>
> +static void kernel_tlb_flush_all(struct flush_tlb_info *info)
> +{
> + if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
> + invlpgb_flush_all();
> + else
> + on_each_cpu(do_flush_tlb_all, NULL, 1);
> +}
> +
> +static void kernel_tlb_flush_range(struct flush_tlb_info *info)
> +{
> + if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
> + invlpgb_kernel_range_flush(info);
> + else
> + on_each_cpu(do_kernel_range_flush, info, 1);
> +}
> +
> void flush_tlb_kernel_range(unsigned long start, unsigned long end)
> {
> struct flush_tlb_info *info;
> @@ -1097,9 +1125,9 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
> TLB_GENERATION_INVALID);
>
> if (info->end == TLB_FLUSH_ALL)
> - on_each_cpu(do_flush_tlb_all, NULL, 1);
> + kernel_tlb_flush_all(info);
> else
> - on_each_cpu(do_kernel_range_flush, info, 1);
> + kernel_tlb_flush_range(info);
>
> put_flush_tlb_info();
> }
But the structure of this code is much better than previous versions.
With the comments fixed:
Acked-by: Dave Hansen <dave.hansen@...el.com>