linux-kernel - Re: [PATCH V3] riscv: asid: Fixup stale TLB entry cause application crash


Message-ID: <mhng-1d55338a-53a1-42eb-bf5c-91655fde2734@palmer-ri-x1c9a>
Date:   Thu, 08 Dec 2022 15:30:08 -0800 (PST)
From:   Palmer Dabbelt <palmer@...osinc.com>
To:     geomatsi@...il.com
CC:     guoren@...nel.org, anup@...infault.org,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Conor Dooley <conor.dooley@...rochip.com>, heiko@...ech.de,
        philipp.tomsich@...ll.eu, alex@...ti.fr,
        Christoph Hellwig <hch@....de>, ajones@...tanamicro.com,
        gary@...yguo.net, jszhang@...nel.org,
        linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
        guoren@...ux.alibaba.com, apatel@...tanamicro.com
Subject:     Re: [PATCH V3] riscv: asid: Fixup stale TLB entry cause application crash

On Fri, 18 Nov 2022 12:57:21 PST (-0800), geomatsi@...il.com wrote:
> Hi Guo Ren,
>
>
>> After use_asid_allocator is enabled, userspace applications can crash
>> because of stale TLB entries: calling cpumask_clear_cpu without
>> local_flush_tlb_all does not guarantee that the CPU's TLB entries are
>> fresh. With set_mm_asid, a user space application can then read a stale
>> value through a stale TLB entry; set_mm_noasid is not affected.
>
> ... [snip]
>
>> +	/*
>> +	 * The mm_cpumask indicates which harts' TLBs contain the virtual
>> +	 * address mapping of the mm. Compared to noasid, using asid
>> +	 * can't guarantee that stale TLB entries are invalidated because
>> +	 * the asid mechanism doesn't flush the TLB on every switch_mm, for
>> +	 * performance reasons. So when using asid, keep every CPU's
>> +	 * footprint in mm_cpumask() until the mm is reset.
>> +	 */
>> +	cpumask_set_cpu(cpu, mm_cpumask(next));
>> +	if (static_branch_unlikely(&use_asid_allocator)) {
>> +		set_mm_asid(next, cpu);
>> +	} else {
>> +		cpumask_clear_cpu(cpu, mm_cpumask(prev));
>> +		set_mm_noasid(next);
>> +	}
>>  }
>
> I observe similar user-space crashes on my SMP systems with ASID enabled.
> My attempt to fix the issue was a bit different; see the following patch:
>
> https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/
>
> In brief, the idea was borrowed from flush_icache_mm handling:
> - keep track of CPUs not running the task
> - perform per-ASID TLB flush on such CPUs only if the task is switched there
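The two steps above can be illustrated with a small user-space model. This is only a sketch of the general idea, not the code from the linked patch: the struct, the function names (mm_model_*), the 8-CPU limit, and the flush counter are all hypothetical, and a plain bitmask stands in for the kernel's cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical small system */

/* Hypothetical stand-in for per-mm state; not the kernel's mm_struct. */
struct mm_model {
	uint32_t cpumask;	/* CPUs currently running this mm */
	uint32_t tlb_stale_mask;	/* CPUs whose TLB may hold stale entries */
	unsigned int asid_flushes;	/* deferred per-ASID flushes performed */
};

static uint32_t cpu_bit(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return UINT32_C(1) << cpu;
}

/*
 * Model of a mapping change: CPUs running the mm are assumed to be
 * flushed immediately; every other CPU is only marked as stale.
 */
static void mm_model_flush(struct mm_model *mm)
{
	uint32_t all = (UINT32_C(1) << NR_CPUS) - 1;

	mm->tlb_stale_mask |= all & ~mm->cpumask;
}

/*
 * Model of switching the mm onto a CPU: if that CPU was marked stale,
 * do the deferred per-ASID flush now. Returns true if a flush happened.
 */
static bool mm_model_switch_in(struct mm_model *mm, unsigned int cpu)
{
	uint32_t bit = cpu_bit(cpu);

	mm->cpumask |= bit;
	if (mm->tlb_stale_mask & bit) {
		mm->tlb_stale_mask &= ~bit;
		mm->asid_flushes++;	/* stands in for a local per-ASID flush */
		return true;
	}
	return false;
}

/* Model of switching the mm away: the CPU leaves the mm's cpumask. */
static void mm_model_switch_out(struct mm_model *mm, unsigned int cpu)
{
	mm->cpumask &= ~cpu_bit(cpu);
}
```

In this model a CPU that stops running the task is dropped from the cpumask right away, so later mapping changes do not target it, and it pays for one local flush only if the task is scheduled there again.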

That way looks better to me: leaking hartids in the ASID allocator might 
make the crashes go away, but it's just going to end up trending towards 
flushing everything and that doesn't seem like the right long-term 
solution.

So I've got that one on for-next, sorry I missed it before.

Thanks!

>
> Your patch also works fine in my tests, fixing those crashes. I have a
> question, though, regarding the removed cpumask_clear_cpu. How are CPUs
> that no longer run the task removed from its mm_cpumask? If they are not
> removed, then flush_tlb_mm/flush_tlb_page will broadcast unnecessary
> TLB flushes to those CPUs when ASID is enabled.
>
> Regards,
> Sergey