lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <7D01FB21-5182-428D-BCD8-89C679ABEEC8@linux.dev>
Date:   Thu, 31 Aug 2023 11:27:52 +0800
From:   Muchun Song <muchun.song@...ux.dev>
To:     Mike Kravetz <mike.kravetz@...cle.com>
Cc:     Linux-MM <linux-mm@...ck.org>, LKML <linux-kernel@...r.kernel.org>,
        Muchun Song <songmuchun@...edance.com>,
        Joao Martins <joao.m.martins@...cle.com>,
        Oscar Salvador <osalvador@...e.de>,
        David Hildenbrand <david@...hat.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        David Rientjes <rientjes@...gle.com>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
        Michal Hocko <mhocko@...e.com>,
        Matthew Wilcox <willy@...radead.org>,
        Xiongchun Duan <duanxiongchun@...edance.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 09/12] hugetlb_vmemmap: Optimistically set Optimized flag



> On Aug 31, 2023, at 06:47, Mike Kravetz <mike.kravetz@...cle.com> wrote:
> 
> On 08/30/23 15:26, Muchun Song wrote:
>> 
>> 
>> On 2023/8/26 03:04, Mike Kravetz wrote:
>>> At the beginning of hugetlb_vmemmap_optimize, optimistically set
>>> the HPageVmemmapOptimized flag in the head page.  Clear the flag
>>> if the operation fails.
>>> 
>>> No change in behavior.  However, this will become important in
>>> subsequent patches where we batch delay TLB flushing.  We need to
>>> make sure the content in the old and new vmemmap pages are the same.
>> 
>> Sorry, I didn't get the point here. Could you elaborate it?
>> 
> 
> Sorry, this really could use a better explanation.
> 
>>> 
>>> Signed-off-by: Mike Kravetz <mike.kravetz@...cle.com>
>>> ---
>>>  mm/hugetlb_vmemmap.c | 8 +++++---
>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index e390170c0887..500a118915ff 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -566,7 +566,9 @@ static void __hugetlb_vmemmap_optimize(const struct hstate *h,
>>>   if (!vmemmap_should_optimize(h, head))
>>>   return;
>>> + /* Optimistically assume success */
>>>   static_branch_inc(&hugetlb_optimize_vmemmap_key);
>>> + SetHPageVmemmapOptimized(head);
>>>   vmemmap_end = vmemmap_start + hugetlb_vmemmap_size(h);
>>>   vmemmap_reuse = vmemmap_start;
>>> @@ -577,10 +579,10 @@ static void __hugetlb_vmemmap_optimize(const struct hstate *h,
>>>    * to the page which @vmemmap_reuse is mapped to, then free the pages
>>>    * which the range [@vmemmap_start, @vmemmap_end] is mapped to.
>>>    */
>>> - if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, bulk_pages))
>>> + if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse, bulk_pages)) {
>>>   static_branch_dec(&hugetlb_optimize_vmemmap_key);
>>> - else
>>> - SetHPageVmemmapOptimized(head);
>>> + ClearHPageVmemmapOptimized(head);
>>> + }
> 
> Consider the case where we have successfully remapped vmemmap AND
> - we have replaced the page table page (pte page) containing the struct
>  page of the hugetlb head page.  Joao's commit 11aad2631bf7
>  'mm/hugetlb_vmemmap: remap head page to newly allocated page'.
> - we have NOT flushed the TLB after remapping due to batching the
>  operations before flush.
> 
> In this case, it is possible that the old head page is still in the TLB
> and caches and SetHPageVmemmapOptimized(head) will actually set the flag
> in the old pte page.  We then have an optimized hugetlb page without the
> HPageVmemmapOptimized flag set.  When developing this series, we
> experienced various BUGs as a result of this situation.

Now, I got it. Thanks for your elaboration.

> 
> In the case of an error during optimization, we do a TLB flush so if
> we need to clear the flag we will write to the correct pte page.

Right.

> 
> Hope that makes sense.
> 
> I add an explanation like this to the commit message and perhaps put
> this closer to/or squash with the patch that batches operations before
> flushing TLB.

Yes. But I'd also like to add a big comment to explain what's going on here
instead of a simple "Optimistically assume success". This one really makes
me think it is an optimization not a mandatory premise.

Thanks.

> -- 
> Mike Kravetz
> 
>>>  }
>>>  /**


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ