linux-kernel - Re: [PATCH] x86/mm/tlb: ignore f->new_tlb

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c5edb95c-3ca3-9339-47d6-6304f9bfd708@intel.com>
Date:   Fri, 8 Jul 2022 07:49:55 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Nadav Amit <nadav.amit@...il.com>, linux-kernel@...r.kernel.org,
        Hugh Dickins <hughd@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>
Cc:     Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        x86@...nel.org, Linux MM <linux-mm@...ck.org>,
        Nadav Amit <namit@...are.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>
Subject: Re: [PATCH] x86/mm/tlb: ignore f->new_tlb_gen when zero

On 7/7/22 17:30, Nadav Amit wrote:

You might want to fix the clock on the system from which you sent this.
 I was really scratching my head trying to figure out how you got this
patch out before Hugh's bug report.

> From: Nadav Amit <namit@...are.com>
> 
> Commit aa44284960d5 ("x86/mm/tlb: Avoid reading mm_tlb_gen when
> possible") introduced an optimization of skipping the flush if the TLB
> generation that is flushed (as provided in flush_tlb_info) was already
> flushed.
> 
> However, arch_tlbbatch_flush() does not provide any generation in
> flush_tlb_info. As a result, try_to_unmap_one() would not perform any
> TLB flushes.
> 
> Fix it by checking whether f->new_tlb_gen is nonzero. Zero value is
> anyhow is an invalid generation value.

It is, but the check below uses 'f->end == TLB_FLUSH_ALL' as the marker
for f->new_tlb_gen being invalid.  Being consistent seems like a good
idea on this stuff.

> In addition, add the missing unlikely() and jump to get tracing right.

There are currently five routes out of flush_tlb_func():
 * Three early returns
 * One 'goto done'
 * One implicit return

The tracing code doesn't get run for any of the early returns, but
that's intentional because they don't *actually* flush the TLB.  We
don't want to record that flush_tlb_func() flushed the TLB when it
didn't.  There's another tracepoint on the TLB_REMOTE_SEND_IPI side to
tell where the flushes were requested.

That said, I think the

	if (unlikely(local_tlb_gen == mm_tlb_gen))
		goto done;

is arguably buggy, as is the 'goto done' in this hunk:

>  arch/x86/mm/tlb.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index d9314cc8b81f..d81b4084bb8a 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -771,14 +771,14 @@ static void flush_tlb_func(void *info)
>  		return;
>  	}
>  
> -	if (f->new_tlb_gen <= local_tlb_gen) {
> +	if (unlikely(f->new_tlb_gen != 0 && f->new_tlb_gen <= local_tlb_gen)) {
>  		/*
>  		 * The TLB is already up to date in respect to f->new_tlb_gen.
>  		 * While the core might be still behind mm_tlb_gen, checking
>  		 * mm_tlb_gen unnecessarily would have negative caching effects
>  		 * so avoid it.
>  		 */
> -		return;
> +		goto done;
>  	}
>  
>  	/*

We might want to (eventually) think about doing something like the
attached patch to make the skipped flushes explicit in the tracing and
make the return paths out of this function more consistent.
View attachment "tlbtrace.patch" of type "text/x-patch" (2408 bytes)