Message-ID: <a8ac43c8-d5e7-4d47-89fb-394db5994fe1@os.amperecomputing.com>
Date: Tue, 13 Aug 2024 10:08:22 -0700
From: Yang Shi <yang@...amperecomputing.com>
To: Catalin Marinas <catalin.marinas@....com>
Cc: muchun.song@...ux.dev, will@...nel.org, akpm@...ux-foundation.org,
 linux-mm@...ck.org, linux-arm-kernel@...ts.infradead.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hugetlbfs: add MTE support



On 7/9/24 10:42 AM, Yang Shi wrote:
>
>
> On 7/4/24 6:44 AM, Catalin Marinas wrote:
>> On Tue, Jul 02, 2024 at 05:15:12PM -0700, Yang Shi wrote:
>>> On 7/2/24 5:04 PM, Yang Shi wrote:
>>>> On 7/2/24 5:34 AM, Catalin Marinas wrote:
>>>>> Last time I checked, about a year ago, this was not sufficient. One
>>>>> issue is that there's no arch_clear_hugetlb_flags() implemented by
>>>>> your patch, leaving PG_arch_{2,3} set on a page. The other issue was
>>>>> that I initially tried to do this only on the head page but this did
>>>>> not go well with the folio_copy() -> copy_highpage() which expects
>>>>> the PG_arch_* flags on each individual page. The alternative was for
>>>>> arch_clear_hugetlb_flags() to iterate over all the pages in a folio.
>>>> Thanks for pointing this out. I did miss this point. I took a quick
>>>> look at when the PG_* flags are set. IIUC, they are set by
>>>> post_alloc_hook() for order-0 anonymous folios (clearing both the page
>>>> and the tags) and by set_ptes() for everything else (just clearing the
>>>> tags), for example THP and hugetlb.
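>>>>
>>>> (For context, the allocation-time path does roughly the below;
>>>> paraphrased from post_alloc_hook() in mm/page_alloc.c, so the exact
>>>> names around it are approximate:)
>>>>
>>>> if (zero_tags) {
>>>>         /* tag_clear_highpage() zeroes the page and its MTE tags */
>>>>         for (i = 0; i != 1 << order; ++i)
>>>>                 tag_clear_highpage(page + i);
>>>> }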
>>>>
>>>> I can see THP does set the PG_mte_tagged flag for each subpage. But it
>>>> seems it does not do so for hugetlb, if I read the code correctly. The
>>>> call path is:
>>>>
>>>> hugetlb_fault() ->
>>>>   hugetlb_no_page() ->
>>>>     set_huge_pte_at() ->
>>>>       __set_ptes() ->
>>>>         __sync_cache_and_tags()
>>>>
>>>> __set_ptes() is called in a loop:
>>>>
>>>> if (!pte_present(pte)) {
>>>>         for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
>>>>                 __set_ptes(mm, addr, ptep, pte, 1);
>>>>         return;
>>>> }
>>>>
>>>> The ncontig and pgsize are returned by num_contig_ptes(). For example,
>>>> for a 2M hugetlb page, ncontig is 1 and pgsize is 2M, IIUC. So it means
>>>> only the head page actually has PG_mte_tagged set. If so,
>>>> copy_highpage() will just copy the old head page's flag, and its tags,
>>>> to the new head page. None of the subpages have PG_mte_tagged set.
>>>>
>>>> Is this the expected behavior? I suppose we need tags for every
>>>> subpage too, right?
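>>>>
>>>> (For reference, a simplified version of num_contig_ptes(); the real
>>>> one in arch/arm64/mm/hugetlbpage.c also handles the PUD case:)
>>>>
>>>> static int num_contig_ptes(unsigned long size, size_t *pgsize)
>>>> {
>>>>         int contig_ptes = 0;
>>>>
>>>>         *pgsize = size;
>>>>
>>>>         switch (size) {
>>>>         case PMD_SIZE:          /* 2M with 4K pages: one block entry */
>>>>                 contig_ptes = 1;
>>>>                 break;
>>>>         case CONT_PMD_SIZE:     /* 32M: CONT_PMDS entries of PMD_SIZE */
>>>>                 *pgsize = PMD_SIZE;
>>>>                 contig_ptes = CONT_PMDS;
>>>>                 break;
>>>>         case CONT_PTE_SIZE:     /* 64K: CONT_PTES entries of PAGE_SIZE */
>>>>                 *pgsize = PAGE_SIZE;
>>>>                 contig_ptes = CONT_PTES;
>>>>                 break;
>>>>         }
>>>>
>>>>         return contig_ptes;
>>>> }
>>>>
>>>> So for 2M the loop body runs exactly once, and only the head page goes
>>>> through __sync_cache_and_tags().
>>>>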
>>> We would need something like the below to have the tags and the page
>>> flag set up for each subpage:
>>>
>>> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
>>> index 3f09ac73cce3..528164deef27 100644
>>> --- a/arch/arm64/mm/hugetlbpage.c
>>> +++ b/arch/arm64/mm/hugetlbpage.c
>>> @@ -228,9 +228,12 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
>>>         int ncontig;
>>>         unsigned long pfn, dpfn;
>>>         pgprot_t hugeprot;
>>> +       unsigned long nr = sz >> PAGE_SHIFT;
>>>
>>>         ncontig = num_contig_ptes(sz, &pgsize);
>>>
>>> +       __sync_cache_and_tags(pte, nr);
>>> +
>>>         if (!pte_present(pte)) {
>>>                 for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
>>>                         __set_ptes(mm, addr, ptep, pte, 1);
>> We only need this for the last loop when we have a present pte. I'd also
>> avoid calling __set_ptes(...., 1) if we call the __sync_cache_and_tags()
>> here already. Basically just unroll __set_ptes() in set_huge_pte_at()
>> and call __set_pte() directly.
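>>
>> I.e. roughly this for the present-pte path (untested sketch, the
>> surrounding code approximate):
>>
>>         /* sync caches/tags once for the whole mapping, then write the
>>          * PTEs directly instead of going through __set_ptes()
>>          */
>>         __sync_cache_and_tags(pte, sz >> PAGE_SHIFT);
>>
>>         hugeprot = pte_pgprot(pte);
>>         for (i = 0; i < ncontig; i++, ptep++, addr += pgsize, pfn += dpfn)
>>                 __set_pte(ptep, pfn_pte(pfn, hugeprot));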
>>
>> It might be better to convert those page flag checks to only happen on
>> the head page. My stashed changes from over a year ago (before we had
>> more folio conversions) are below. However, as I mentioned, I got stuck
>> on folio_copy(), which also does a cond_resched() between
>> copy_highpage() calls.
>

Ping...

Does this make sense and sound good? If it is good, I will try to 
implement it.

> We can have the page flags set on the head page only for a hugetlb 
> folio. For copy_highpage(), we should be able to do something like the 
> pseudocode below (a sketch in C follows it):
>
> if page_is_head && page_is_hugetlb && head_page_has_mte_tagged
>     set PG_mte_tagged on the new head page
>     copy tags for all subpages
> else // <-- tail page or non-hugetlb page
>     current copy_highpage implementation
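>
> In (untested) code, with the hugetlb head-page check approximated, it
> could look like:
>
> void copy_highpage(struct page *to, struct page *from)
> {
>         void *kto = page_address(to);
>         void *kfrom = page_address(from);
>         struct folio *src = page_folio(from);
>
>         copy_page(kto, kfrom);
>
>         if (!system_supports_mte())
>                 return;
>
>         if (folio_test_hugetlb(src) && from == folio_page(src, 0) &&
>             page_mte_tagged(from)) {
>                 /* flag lives on the head page; copy every subpage's tags */
>                 unsigned long i, nr = folio_nr_pages(src);
>
>                 WARN_ON_ONCE(!try_page_mte_tagging(to));
>                 for (i = 0; i < nr; i++)
>                         mte_copy_page_tags(page_address(to + i),
>                                            page_address(from + i));
>                 set_page_mte_tagged(to);
>         } else if (page_mte_tagged(from)) {
>                 /* tail page or non-hugetlb page: current behavior */
>                 WARN_ON_ONCE(!try_page_mte_tagging(to));
>                 mte_copy_page_tags(kto, kfrom);
>                 set_page_mte_tagged(to);
>         }
> }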
>
>
> The hugetlb folio can't go away under us since the migration path 
> should pin it, so the status of the folio is stable. The preemption 
> caused by cond_resched() should be fine too due to the pin, and the 
> page table entry remains a migration entry until migration is done, so 
> everyone should just see the migration entry and wait for the 
> migration to finish.
>
> The other concerned user of copy_highpage() is uprobe, but it also 
> pins the page before doing the copy, and it is called while holding 
> the write mmap_lock.
>
> IIUC, it should work if I'm not missing something. This should also 
> have no impact on HVO, and the overhead for other users of 
> copy_highpage() should be acceptable.
>
>>
>> ---------8<--------------------------------
>> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
>> index f5bcb0dc6267..a84ad0e46f12 100644
>> --- a/arch/arm64/kernel/mte.c
>> +++ b/arch/arm64/kernel/mte.c
>> @@ -49,7 +49,9 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
>>          return;
>>  
>>      if (try_page_mte_tagging(page)) {
>> -        mte_clear_page_tags(page_address(page));
>> +        unsigned long i, nr_pages = compound_nr(page);
>> +        for (i = 0; i < nr_pages; i++)
>> +            mte_clear_page_tags(page_address(page + i));
>>          set_page_mte_tagged(page);
>>      }
>>  }
>> @@ -57,22 +59,23 @@ static void mte_sync_page_tags(struct page *page, pte_t old_pte,
>>  void mte_sync_tags(pte_t old_pte, pte_t pte)
>>  {
>>      struct page *page = pte_page(pte);
>> -    long i, nr_pages = compound_nr(page);
>> -    bool check_swap = nr_pages == 1;
>> +    bool check_swap = !PageCompound(page);
>>      bool pte_is_tagged = pte_tagged(pte);
>>  
>>      /* Early out if there's nothing to do */
>>      if (!check_swap && !pte_is_tagged)
>>          return;
>>  
>> +    /*
>> +     * HugeTLB pages are always fully mapped, so only setting head page's
>> +     * PG_mte_* flags is enough.
>> +     */
>> +    if (PageHuge(page))
>> +        page = compound_head(page);
>> +
>>      /* if PG_mte_tagged is set, tags have already been initialised */
>> -    for (i = 0; i < nr_pages; i++, page++) {
>> -        if (!page_mte_tagged(page)) {
>> -            mte_sync_page_tags(page, old_pte, check_swap,
>> -                       pte_is_tagged);
>> -            set_page_mte_tagged(page);
>> -        }
>> -    }
>> +    if (!page_mte_tagged(page))
>> +        mte_sync_page_tags(page, old_pte, check_swap, pte_is_tagged);
>>  
>>      /* ensure the tags are visible before the PTE is set */
>>      smp_wmb();
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index 5626ddb540ce..692198d8c0d2 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -1079,7 +1079,7 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
>>  
>>              /* uaccess failed, don't leave stale tags */
>>              if (num_tags != MTE_GRANULES_PER_PAGE)
>> -                mte_clear_page_tags(page);
>> +                mte_clear_page_tags(page_address(page));
>>  
>>              set_page_mte_tagged(page);
>>  
>>              kvm_release_pfn_dirty(pfn);
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index 31d7fa4c7c14..b531b1d75cda 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -1173,11 +1173,10 @@ static void sanitise_mte_tags(struct kvm *kvm, kvm_pfn_t pfn,
>>      if (!kvm_has_mte(kvm))
>>          return;
>>  
>> -    for (i = 0; i < nr_pages; i++, page++) {
>> -        if (try_page_mte_tagging(page)) {
>> -            mte_clear_page_tags(page_address(page));
>> -            set_page_mte_tagged(page);
>> -        }
>> -    }
>> +    if (try_page_mte_tagging(page)) {
>> +        for (i = 0; i < nr_pages; i++)
>> +            mte_clear_page_tags(page_address(page + i));
>> +        set_page_mte_tagged(page);
>> +    }
>>  }
>>
>

