Message-ID: <63a0364b-a2e0-48c2-b255-e976112deeb1@redhat.com>
Date: Fri, 12 Jul 2024 15:39:20 +1000
From: Gavin Shan <gshan@...hat.com>
To: David Hildenbrand <david@...hat.com>, Matthew Wilcox <willy@...radead.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 akpm@...ux-foundation.org, william.kucharski@...cle.com,
 ryan.roberts@....com, shan.gavin@...il.com
Subject: Re: [PATCH] mm/huge_memory: Avoid PMD-size page cache if needed

On 7/12/24 7:03 AM, David Hildenbrand wrote:
> On 11.07.24 22:46, Matthew Wilcox wrote:
>> On Thu, Jul 11, 2024 at 08:48:40PM +1000, Gavin Shan wrote:
>>> +++ b/mm/huge_memory.c
>>> @@ -136,7 +136,8 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>>           while (orders) {
>>>               addr = vma->vm_end - (PAGE_SIZE << order);
>>> -            if (thp_vma_suitable_order(vma, addr, order))
>>> +            if (!(vma->vm_file && order > MAX_PAGECACHE_ORDER) &&
>>> +                thp_vma_suitable_order(vma, addr, order))
>>>                   break;
>>
>> Why does 'orders' even contain potential orders that are larger than
>> MAX_PAGECACHE_ORDER?
>>
>> We do this at the top:
>>
>>          orders &= vma_is_anonymous(vma) ?
>>                          THP_ORDERS_ALL_ANON : THP_ORDERS_ALL_FILE;
>>
>> include/linux/huge_mm.h:#define THP_ORDERS_ALL_FILE     (BIT(PMD_ORDER) | BIT(PUD_ORDER))
>>
>> ... and that seems very wrong.  We support all kinds of orders for
>> files, not just PMD order.  We don't support PUD order at all.
>>
>> What the hell is going on here?
> 
> yes, that's just absolutely confusing. I mentioned it to Ryan lately that we should clean that up (I wanted to look into that, but am happy if someone else can help).
> 
> There should likely be different defines for
> 
> DAX (PMD|PUD)
> 
> SHMEM (PMD) -- but soon more. Not sure if we want separate ANON_SHMEM for the time being. Hm. But shmem is already handled separately, so maybe we can just ignore shmem here.
> 
> PAGECACHE (1 .. MAX_PAGECACHE_ORDER)
> 
> ? But it's still unclear to me.
> 
> At least DAX must stay special I think, and PAGECACHE should be capped at MAX_PAGECACHE_ORDER.
> 

David, I can help to clean it up. Could you please confirm that the following
changes are what you're suggesting? Hopefully there is nothing I've missed.
The changes fix the original issue. With them applied, madvise(MADV_COLLAPSE)
fails with EINVAL (errno 22) in the test program.

The Fixes tag needs to be adjusted as well:

Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 2aa986a5cd1b..45909efb0ef0 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -74,7 +74,12 @@ extern struct kobj_attribute shmem_enabled_attr;
  /*
   * Mask of all large folio orders supported for file THP.
   */
-#define THP_ORDERS_ALL_FILE    (BIT(PMD_ORDER) | BIT(PUD_ORDER))
+#define THP_ORDERS_ALL_FILE_DAX                \
+       ((BIT(PMD_ORDER) | BIT(PUD_ORDER)) & (BIT(MAX_PAGECACHE_ORDER + 1) - 1))
+#define THP_ORDERS_ALL_FILE_DEFAULT    \
+       ((BIT(MAX_PAGECACHE_ORDER + 1) - 1) & ~BIT(0))
+#define THP_ORDERS_ALL_FILE            \
+       (THP_ORDERS_ALL_FILE_DAX | THP_ORDERS_ALL_FILE_DEFAULT)
  
  /*
   * Mask of all large folio orders supported for THP.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2120f7478e55..4690f33afaa6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -88,9 +88,17 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
         bool smaps = tva_flags & TVA_SMAPS;
         bool in_pf = tva_flags & TVA_IN_PF;
         bool enforce_sysfs = tva_flags & TVA_ENFORCE_SYSFS;
+       unsigned long supported_orders;
+
         /* Check the intersection of requested and supported orders. */
-       orders &= vma_is_anonymous(vma) ?
-                       THP_ORDERS_ALL_ANON : THP_ORDERS_ALL_FILE;
+       if (vma_is_anonymous(vma))
+               supported_orders = THP_ORDERS_ALL_ANON;
+       else if (vma_is_dax(vma))
+               supported_orders = THP_ORDERS_ALL_FILE_DAX;
+       else
+               supported_orders = THP_ORDERS_ALL_FILE_DEFAULT;
+
+       orders &= supported_orders;
         if (!orders)
                 return 0;
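
To make the new masks easier to check, here is a minimal user-space sketch that
evaluates the same expressions as THP_ORDERS_ALL_FILE_DAX and
THP_ORDERS_ALL_FILE_DEFAULT above. It is not kernel code: the helper names
file_default_orders() and file_dax_orders() are hypothetical, BIT() is
redefined locally, and the order values passed in are assumptions chosen for
illustration (PMD_ORDER=9, PUD_ORDER=18, MAX_PAGECACHE_ORDER=9 for a 4KB
base-page configuration; PMD_ORDER=13, PUD_ORDER=26, MAX_PAGECACHE_ORDER=11
for a 64KB base-page configuration). A 64-bit unsigned long is assumed.

```c
#include <assert.h>
#include <stdio.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (1UL << (n))

/*
 * Same expression as THP_ORDERS_ALL_FILE_DEFAULT in the diff:
 * every order from 1 up to max_pagecache_order, with order 0 cleared.
 */
static inline unsigned long file_default_orders(unsigned int max_pagecache_order)
{
	assert(max_pagecache_order + 1 < 8 * sizeof(unsigned long));
	return (BIT(max_pagecache_order + 1) - 1) & ~BIT(0);
}

/*
 * Same expression as THP_ORDERS_ALL_FILE_DAX in the diff:
 * PMD and PUD orders, masked to orders <= max_pagecache_order.
 */
static inline unsigned long file_dax_orders(unsigned int pmd_order,
					    unsigned int pud_order,
					    unsigned int max_pagecache_order)
{
	assert(pmd_order < 8 * sizeof(unsigned long));
	assert(pud_order < 8 * sizeof(unsigned long));
	assert(max_pagecache_order + 1 < 8 * sizeof(unsigned long));
	return (BIT(pmd_order) | BIT(pud_order)) &
	       (BIT(max_pagecache_order + 1) - 1);
}

/* Print both masks for one assumed configuration. */
static inline void print_masks(const char *label, unsigned int pmd_order,
			       unsigned int pud_order,
			       unsigned int max_pagecache_order)
{
	printf("%s: default=0x%lx dax=0x%lx\n", label,
	       file_default_orders(max_pagecache_order),
	       file_dax_orders(pmd_order, pud_order, max_pagecache_order));
}
```

Calling print_masks("4K", 9, 18, 9) and print_masks("64K", 13, 26, 11) from a
main() shows the two cases. With the assumed 64KB values, both PMD_ORDER and
PUD_ORDER exceed MAX_PAGECACHE_ORDER, so the DAX mask evaluates to 0; that may
be worth checking against the point that DAX should stay special.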

Thanks,
Gavin

