Message-ID: <345161ac-3b42-48aa-ab3d-3b183316479a@redhat.com>
Date: Fri, 31 May 2024 16:32:04 +0200
From: David Hildenbrand <david@...hat.com>
To: Matthew Wilcox <willy@...radead.org>,
Sergey Senozhatsky <senozhatsky@...omium.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Andrew Morton <akpm@...ux-foundation.org>, Mike Rapoport <rppt@...nel.org>,
Minchan Kim <minchan@...nel.org>, Hyeonggon Yoo <42.hyeyoo@...il.com>
Subject: Re: [PATCH v2 3/6] mm/zsmalloc: use a proper page type
On 31.05.24 16:27, Matthew Wilcox wrote:
> On Thu, May 30, 2024 at 02:01:23PM +0900, Sergey Senozhatsky wrote:
>> On (24/05/29 13:19), David Hildenbrand wrote:
>>> We won't be able to support 256 KiB base pages, which is acceptable.
>> [..]
>>> +config HAVE_ZSMALLOC
>>> + def_bool y
>>> + depends on MMU
>>> + depends on PAGE_SIZE_LESS_THAN_256KB # we want <= 64 KiB
>>
>> Can't really say that I'm happy with this, but if mm-folks are
>> fine then okay.
>
> I have an idea ...
>
> We use 6 of the bits in the top byte of the page_type to enumerate
> a type (i.e. values 0x80-0xbf) and then the remaining 24 bits are
> available. It's actually more efficient:
>
> $ ./scripts/bloat-o-meter prev.o .build-debian/mm/filemap.o
> add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-40 (-40)
> Function old new delta
> __filemap_add_folio 1102 1098 -4
> filemap_unaccount_folio 455 446 -9
> replace_page_cache_folio 474 447 -27
> Total: Before=41258, After=41218, chg -0.10%
>
> (that's all from PG_hugetlb)
>
> before:
> 1406: 8b 46 30 mov 0x30(%rsi),%eax
> mapcount = atomic_read(&folio->_mapcount) + 1;
> 1409: 83 c0 01 add $0x1,%eax
> if (mapcount < PAGE_MAPCOUNT_RESERVE + 1)
> 140c: 83 f8 81 cmp $0xffffff81,%eax
> 140f: 7d 6c jge 147d <filemap_unaccount_folio+0x8d>
> 1411: 8b 43 30 mov 0x30(%rbx),%eax
> 1414: 25 00 08 00 f0 and $0xf0000800,%eax
> 1419: 3d 00 00 00 f0 cmp $0xf0000000,%eax
> 141e: 74 4e je 146e <filemap_unaccount_folio+0x7e>
>
> after:
> 1406: 8b 46 30 mov 0x30(%rsi),%eax
> mapcount = atomic_read(&folio->_mapcount) + 1;
> 1409: 83 c0 01 add $0x1,%eax
> if (mapcount < PAGE_MAPCOUNT_RESERVE + 1)
> 140c: 83 f8 81 cmp $0xffffff81,%eax
> 140f: 7d 63 jge 1474 <filemap_unaccount_folio+0x84>
> if (folio_test_hugetlb(folio))
> 1411: 80 7b 33 84 cmpb $0x84,0x33(%rbx)
> 1415: 74 4e je 1465 <filemap_unaccount_folio+0x75>
>
> so we go from "mov, and, cmp, je" to just "cmpb, je", which must surely
> be faster to execute as well as being more compact in the I$ (6 bytes vs 15).
>
> Anyway, not tested but this is the patch I used to generate the above.
> More for comment than application.
Right, it's likely very similar to my previous proposal to use 8 bits
(uint8_t) for the type:
https://lore.kernel.org/all/00ba1dff-7c05-46e8-b0d9-a78ac1cfc198@redhat.com/

I would prefer doing that separately, unless someone can make the case
for why we care about zram + 256 KiB that much right now. (claim: we don't)
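
With the type enumerated in the top byte, the test collapses from a
mask-and-compare on the whole 32-bit page_type into a single byte
compare. An untested sketch of the idea (the helper name is invented
here for illustration, it is not part of the patch):

    /*
     * Sketch only: assumes the enumerated type occupies the top
     * byte of the 32-bit page_type field.
     */
    static inline bool page_has_type_value(const struct page *page,
                                           unsigned int type)
    {
            /* e.g. type == 0x84 for PG_hugetlb in the diff below */
            return (page->page_type >> 24) == type;
    }

With page_type at offset 0x30 of struct page, that top byte is the
0x33(%rbx) read by the cmpb in the disassembly above (little-endian
x86-64).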
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 5265b3434b9e..4129d04ac812 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -942,24 +942,24 @@ PAGEFLAG_FALSE(HasHWPoisoned, has_hwpoisoned)
> * mistaken for a page type value.
> */
>
> -#define PAGE_TYPE_BASE 0xf0000000
> -/* Reserve 0x0000007f to catch underflows of _mapcount */
> -#define PAGE_MAPCOUNT_RESERVE -128
> -#define PG_buddy 0x00000080
> -#define PG_offline 0x00000100
> -#define PG_table 0x00000200
> -#define PG_guard 0x00000400
> -#define PG_hugetlb 0x00000800
> -#define PG_slab 0x00001000
> -
> -#define PageType(page, flag) \
> - ((page->page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> -#define folio_test_type(folio, flag) \
> - ((folio->page.page_type & (PAGE_TYPE_BASE | flag)) == PAGE_TYPE_BASE)
> +/* Reserve 0x0000007f to catch underflows of _mapcount */
> +#define PAGE_MAPCOUNT_RESERVE -128
> +
> +#define PG_buddy 0x80
> +#define PG_offline 0x81
> +#define PG_table 0x82
> +#define PG_guard 0x83
> +#define PG_hugetlb 0x84
> +#define PG_slab 0x85
Hoping we can stop calling that PG_ ...
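
For example (illustration only, not from the patch), the values could
move into their own enum under a different prefix, so they can no
longer be confused with the PG_ page flags:

    /* Sketch: enumerated page types; the PGTY_ prefix is made up here. */
    enum pagetype {
            PGTY_buddy   = 0x80,
            PGTY_offline = 0x81,
            PGTY_table   = 0x82,
            PGTY_guard   = 0x83,
            PGTY_hugetlb = 0x84,
            PGTY_slab    = 0x85,
    };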
--
Cheers,
David / dhildenb