linux-kernel - Re: [PATCH v2 2/3] LoongArch: Add barrier between set

Message-ID: <e7c06bf4-897a-7060-61f9-97435d2af16e@loongson.cn>
Date: Tue, 15 Oct 2024 10:53:35 +0800
From: maobibo <maobibo@...ngson.cn>
To: Huacai Chen <chenhuacai@...nel.org>
Cc: Andrey Ryabinin <ryabinin.a.a@...il.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 David Hildenbrand <david@...hat.com>, Barry Song <baohua@...nel.org>,
 loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org,
 kasan-dev@...glegroups.com, linux-mm@...ck.org
Subject: Re: [PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory
 access



On 2024/10/14 2:31 PM, Huacai Chen wrote:
> Hi, Bibo,
> 
> On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao <maobibo@...ngson.cn> wrote:
>>
>> It is possible to return a spurious fault if memory is accessed
>> right after the pte is set. For user address space, pte is set
>> in kernel space and memory is accessed in user space, there is
>> long time for synchronization, no barrier needed. However for
>> kernel address space, it is possible that memory is accessed
>> right after the pte is set.
>>
>> Here flush_cache_vmap/flush_cache_vmap_early is used for
>> synchronization.
>>
>> Signed-off-by: Bibo Mao <maobibo@...ngson.cn>
>> ---
>>   arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
>> index f8754d08a31a..53be231319ef 100644
>> --- a/arch/loongarch/include/asm/cacheflush.h
>> +++ b/arch/loongarch/include/asm/cacheflush.h
>> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
>>   #define flush_cache_dup_mm(mm)                         do { } while (0)
>>   #define flush_cache_range(vma, start, end)             do { } while (0)
>>   #define flush_cache_page(vma, vmaddr, pfn)             do { } while (0)
>> -#define flush_cache_vmap(start, end)                   do { } while (0)
>>   #define flush_cache_vunmap(start, end)                 do { } while (0)
>>   #define flush_icache_user_page(vma, page, addr, len)   do { } while (0)
>>   #define flush_dcache_mmap_lock(mapping)                        do { } while (0)
>>   #define flush_dcache_mmap_unlock(mapping)              do { } while (0)
>>
>> +/*
>> + * It is possible for a kernel virtual mapping access to return a spurious
>> + * fault if it's accessed right after the pte is set. The page fault handler
>> + * does not expect this type of fault. flush_cache_vmap is not exactly the
>> + * right place to put this, but it seems to work well enough.
>> + */
>> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
>> +{
>> +       smp_mb();
>> +}
>> +#define flush_cache_vmap flush_cache_vmap
>> +#define flush_cache_vmap_early flush_cache_vmap
> From the history of flush_cache_vmap_early(), it seems that only archs with
> a "virtual cache" (VIVT or VIPT) need this API, so it can be a no-op on
> LoongArch.

Here is how flush_cache_vmap_early() is used in linux/mm/percpu.c: the
pages are mapped and then accessed immediately. Do you think it should
be a no-op on LoongArch?

    rc = __pcpu_map_pages(unit_addr, &pages[unit * unit_pages],
                          unit_pages);
    if (rc < 0)
        panic("failed to map percpu area, err=%d\n", rc);

    flush_cache_vmap_early(unit_addr, unit_addr + ai->unit_size);

    /* copy static data */
    memcpy((void *)unit_addr, __per_cpu_load, ai->static_size);
}
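
To make the sequence above concrete, here is a minimal userspace sketch,
not kernel code. All model_* names are hypothetical, the flat pointer
array only stands in for a page table, and atomic_thread_fence() only
stands in for smp_mb(). It cannot reproduce the spurious fault itself; it
just shows where the hook sits between setting the pte and the first
access.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64u
#define MODEL_NR_PAGES  4u

/* Hypothetical stand-ins: backing memory and a flat "page table". */
static unsigned char model_backing[MODEL_NR_PAGES][MODEL_PAGE_SIZE];
static unsigned char *model_pte[MODEL_NR_PAGES]; /* NULL means "not mapped" */
static unsigned int model_barriers;              /* counts hook invocations */

/* Models set_pte(): publish the translation for one virtual page. */
static void model_set_pte(size_t vpn)
{
	model_pte[vpn] = model_backing[vpn];
}

/* Models the proposed flush_cache_vmap(): a full barrier and nothing else. */
static void model_flush_cache_vmap(void)
{
	/* Assumption: a seq_cst fence is the userspace stand-in for smp_mb(). */
	atomic_thread_fence(memory_order_seq_cst);
	model_barriers++;
}

/* Models the percpu sequence: map, run the hook, then copy immediately. */
static int model_map_and_copy(size_t vpn, const void *src, size_t len)
{
	if (vpn >= MODEL_NR_PAGES || len > MODEL_PAGE_SIZE)
		return -1;

	model_set_pte(vpn);
	model_flush_cache_vmap();
	memcpy(model_pte[vpn], src, len); /* first access through the new mapping */
	return 0;
}
```

Calling model_map_and_copy(1, "abc", 4) returns 0, leaves the bytes in
model_backing[1], and runs the hook exactly once before the copy. If the
hook were a no-op, nothing would sit between the pte update and the copy,
which is the case being asked about here.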


> 
> And I still think flush_cache_vunmap() should be an smp_mb(). An
> smp_mb() in flush_cache_vmap() prevents subsequent accesses from being
> reordered before pte_set(), and an smp_mb() in flush_cache_vunmap()
The smp_mb() in flush_cache_vmap() is not there to prevent reordering.
Its purpose is to flush the pipeline and let the hardware page table
walker sync with the data cache.

Take the following example:
   rb = vmap(pages, nr_meta_pages + 2 * nr_data_pages,
                   VM_MAP | VM_USERMAP, PAGE_KERNEL);
   if (rb) {
<<<<<<<<<<< * The if (rb) statement already prevents reordering here.
Otherwise every use of kmalloc/vmap/vmalloc followed by a memory access
would have a reordering issue. *
       kmemleak_not_leak(pages);
       rb->pages = pages;
       rb->nr_pages = nr_pages;
       return rb;
   }

> prevents preceding accesses from being reordered after pte_clear(). This
Can you give an example of such a use of flush_cache_vunmap()? Then we
can continue the discussion; otherwise it is just guessing.

Regards
Bibo Mao
> potential problem may not be seen from experiment, but it is needed in
> theory.
> 
> Huacai
> 
>> +
>>   #define cache_op(op, addr)                                             \
>>          __asm__ __volatile__(                                           \
>>          "       cacop   %0, %1                                  \n"     \
>> --
>> 2.39.3
>>
>>