Date:   Thu, 10 Aug 2023 16:49:44 -0400
From:   Peter Xu <peterx@...hat.com>
To:     linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc:     Yu Zhao <yuzhao@...gle.com>, peterx@...hat.com,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Yang Shi <shy828301@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        Hugh Dickins <hughd@...gle.com>,
        David Hildenbrand <david@...hat.com>,
        Ryan Roberts <ryan.roberts@....com>,
        Matthew Wilcox <willy@...radead.org>
Subject: [PATCH RFC] mm: Properly document tail pages for compound pages

Tail page struct reuse is over-complicated.  Partly that is because tail
page fields are used implicitly (mapcounts, or private for THP swap
support, etc.): we _may_ still use them through the page structs, but their
relationship to the folio definitions is not obvious.  Partly it is because
struct page has different 32/64-bit layouts, so it is unclear what can and
cannot be used when looking for a new spot in struct folio.

We also have tricks like page->mapping, which can be reused only in tail
pages 1 and 2, and in no tail page beyond 2.  This is mostly hidden until
someone reads into the VM_BUG_ON_PAGE() in __split_huge_page_tail().

Let's document clearly what we can use and what we can't, with an
explanation for every field.  Hopefully this will:

  (1) Let any reader know exactly which field is where and what it is for,
      and how the folio tail pages relate to the struct page definitions,

  (2) Make it clear, for any new field to be added to a large folio,
      which slots can still be reused (look for the _reserved* ones).

This assumes WORD is defined as sizeof(void *) on all archs, just like
the other comment we already have in struct page.
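
As a rough illustration of that WORD bookkeeping (not part of this patch;
every name below is hypothetical, and the 8-WORD size of struct page is
taken from the table in the new comment), the per-WORD availability could
be modelled in userspace like this:

```c
#include <stddef.h>

/* Hypothetical illustration; none of these names exist in the kernel. */
#define TAIL_PAGE_WORDS 8	/* assumed: a tail page spans 8 WORDs */

enum tail_word_state {
	TAIL_WORD_USED,		/* claimed by struct page, never reusable */
	TAIL_WORD_COND,		/* reusable only under some conditions */
	TAIL_WORD_FREE,		/* "N/A" in the table: free for the folio */
};

/* Mirrors the index/field table added by the patch. */
static const enum tail_word_state tail_word_map[TAIL_PAGE_WORDS] = {
	[0] = TAIL_WORD_USED,	/* flags */
	[1] = TAIL_WORD_USED,	/* compound_head */
	[2] = TAIL_WORD_FREE,
	[3] = TAIL_WORD_COND,	/* mapping: reusable in tail pages 1/2 only */
	[4] = TAIL_WORD_FREE,
	[5] = TAIL_WORD_COND,	/* private: taken when THP_SWAP is on */
	[6] = TAIL_WORD_USED,	/* mapcount */
	[7] = TAIL_WORD_FREE,
};

/* Map a byte offset inside a tail page to its WORD index. */
static size_t tail_word_index(size_t byte_offset)
{
	return byte_offset / sizeof(void *);
}

/* Count the WORDs that are unconditionally free for folio fields. */
static int tail_free_words(void)
{
	int i, n = 0;

	for (i = 0; i < TAIL_PAGE_WORDS; i++)
		if (tail_word_map[i] == TAIL_WORD_FREE)
			n++;
	return n;
}
```

Because the index is derived from sizeof(void *), the same table applies
to both 32 and 64 bits; only the byte offsets differ.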

One downside is that part of the tail page 1 definition has to be split
into separate 32/64-bit variants, which duplicates some fields.  Hopefully
that's worthwhile, as it makes everything crystal clear.  The split also
brings a benefit: we can define the fields in a different order for 32/64
bits when we want to.
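
To make the split concrete, here is a userspace sketch (again not part of
the patch) that mocks both tail page 1 variants with fixed-width types and
checks which WORD each field lands in.  The struct names, the
mock_atomic_t type and the WORD_OF() helper are all made up for
illustration; the field order follows the diff below:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mock-ups; fixed-width types stand in for the kernel's. */
typedef struct { int32_t counter; } mock_atomic_t;

/* Tail page 1 as laid out when a WORD is 8 bytes. */
struct mock_tail1_64 {
	uint64_t flags_1;		/* WORD 0 */
	uint64_t head_1;		/* WORD 1 */
	uint8_t folio_dtor;		/* WORD 2 */
	uint8_t folio_order;
	uint8_t holes_1[2];
	mock_atomic_t entire_mapcount;
	mock_atomic_t nr_pages_mapped;	/* WORD 3 */
	mock_atomic_t pincount;
	uint32_t folio_nr_pages;	/* WORD 4 */
	uint32_t reserved_1_1;
	uint64_t used_1_2[2];		/* WORD 5-6 */
	uint64_t reserved_1_2;		/* WORD 7 */
};

/* Tail page 1 as laid out when a WORD is 4 bytes. */
struct mock_tail1_32 {
	uint32_t flags_1;		/* WORD 0 */
	uint32_t head_1;		/* WORD 1 */
	uint8_t folio_dtor;		/* WORD 2 */
	uint8_t folio_order;
	uint8_t holes_1[2];
	mock_atomic_t entire_mapcount;	/* WORD 3 */
	mock_atomic_t nr_pages_mapped;	/* WORD 4 */
	mock_atomic_t pincount;		/* WORD 5 */
	uint32_t used_1_2;		/* WORD 6 */
	uint32_t reserved_1;		/* WORD 7 */
};

/* WORD index of a field, for a given WORD size in bytes. */
#define WORD_OF(type, field, word_size) \
	(offsetof(type, field) / (word_size))

/* Returns 1 when both mock layouts span exactly 8 WORDs as expected. */
static int mock_layouts_ok(void)
{
	return sizeof(struct mock_tail1_64) == 8 * 8 &&
	       sizeof(struct mock_tail1_32) == 8 * 4 &&
	       WORD_OF(struct mock_tail1_64, pincount, 8) == 3 &&
	       WORD_OF(struct mock_tail1_32, pincount, 4) == 5 &&
	       WORD_OF(struct mock_tail1_64, reserved_1_2, 8) == 7 &&
	       WORD_OF(struct mock_tail1_32, reserved_1, 4) == 7;
}
```

Note how _pincount sits in WORD 3 in the 8-byte layout but in WORD 5 in
the 4-byte one, which is the kind of per-width ordering freedom mentioned
above.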

Signed-off-by: Peter Xu <peterx@...hat.com>
---
 include/linux/mm_types.h | 76 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 67 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 291c05cacd48..3e40e1b9fec3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -313,41 +313,99 @@ struct folio {
 		};
 		struct page page;
 	};
+	/*
+	 * Some of the tail page fields (out of 8 WORDs for either 32/64
+	 * bits archs) may not be reused by the folio object because
+	 * they've already been used by the page struct:
+	 *
+	 * |-------+---------------|
+	 * | Index | Field         |
+	 * |-------+---------------|
+	 * |     0 | flags         |
+	 * |     1 | compound_head |
+	 * |     2 | N/A [0]       |
+	 * |     3 | mapping [1]   |
+	 * |     4 | N/A [0]       |
+	 * |     5 | private [2]   |
+	 * |     6 | mapcount      |
+	 * |     7 | N/A [0]       |
+	 * |-------+---------------|
+	 *
+	 * [0] "N/A" marks fields that are available to leverage for the
+	 *     large folio.
+	 *
+	 * [1] "mapping" field is only used for sanity check, see
+	 *     TAIL_MAPPING.  Still valid to use for tail pages 1/2.
+	 *     (for that, see __split_huge_page_tail()).
+	 *
+	 * [2] "private" field is used when THP_SWAP is on (disabled on 32
+	 *     bits, or on hugetlb folios).
+	 */
 	union {
 		struct {
+	/* WORD 0-1: not valid to reuse */
 			unsigned long _flags_1;
 			unsigned long _head_1;
-	/* public: */
+	/* WORD 2 */
 			unsigned char _folio_dtor;
 			unsigned char _folio_order;
+			unsigned char _holes_1[2];
+#ifdef CONFIG_64BIT
 			atomic_t _entire_mapcount;
+	/* WORD 3 */
 			atomic_t _nr_pages_mapped;
 			atomic_t _pincount;
-#ifdef CONFIG_64BIT
+	/* WORD 4 */
 			unsigned int _folio_nr_pages;
+			unsigned int _reserved_1_1;
+	/* WORD 5-6: not valid to reuse */
+			unsigned long _used_1_2[2];
+	/* WORD 7 */
+			unsigned long _reserved_1_2;
+#else
+	/* WORD 3 */
+			atomic_t _entire_mapcount;
+	/* WORD 4 */
+			atomic_t _nr_pages_mapped;
+	/* WORD 5: only valid for 32bits */
+			atomic_t _pincount;
+	/* WORD 6: not valid to reuse */
+			unsigned long _used_1_2;
+	/* WORD 7 */
+			unsigned long _reserved_1;
 #endif
-	/* private: the union with struct page is transitional */
 		};
+	/* private: the union with struct page is transitional */
 		struct page __page_1;
 	};
 	union {
 		struct {
+	/* WORD 0-1: not valid to reuse */
 			unsigned long _flags_2;
 			unsigned long _head_2;
-	/* public: */
+	/* WORD 2-5 */
 			void *_hugetlb_subpool;
 			void *_hugetlb_cgroup;
 			void *_hugetlb_cgroup_rsvd;
 			void *_hugetlb_hwpoison;
-	/* private: the union with struct page is transitional */
+	/* WORD 6: not valid to reuse */
+			unsigned long _used_2_2;
+	/* WORD 7: */
+			unsigned long _reserved_2_1;
 		};
 		struct {
-			unsigned long _flags_2a;
-			unsigned long _head_2a;
-	/* public: */
+	/* WORD 0-1: not valid to reuse */
+			unsigned long _used_2_3[2];
+	/* WORD 2-3: */
 			struct list_head _deferred_list;
-	/* private: the union with struct page is transitional */
+	/* WORD 4: */
+			unsigned long _reserved_2_2;
+	/* WORD 5-6: not valid to reuse */
+			unsigned long _used_2_4[2];
+	/* WORD 7: */
+			unsigned long _reserved_2_3;
 		};
+	/* private: the union with struct page is transitional */
 		struct page __page_2;
 	};
 };
-- 
2.41.0
