Message-ID: <20210604030712.11b31259@linux.microsoft.com>
Date:   Fri, 4 Jun 2021 03:07:12 +0200
From:   Matteo Croce <mcroce@...ux.microsoft.com>
To:     "Matthew Wilcox (Oracle)" <willy@...radead.org>
Cc:     akpm@...ux-foundation.org, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v10 00/33] Memory folios

On Tue, 11 May 2021 22:47:02 +0100
"Matthew Wilcox (Oracle)" <willy@...radead.org> wrote:

> We also waste a lot of instructions ensuring that we're not looking at
> a tail page.  Almost every call to PageFoo() contains one or more
> hidden calls to compound_head().  This also happens for get_page(),
> put_page() and many more functions.  There does not appear to be a
> way to tell gcc that it can cache the result of compound_head(), nor
> is there a way to tell it that compound_head() is idempotent.
> 

It may not be effective in all situations, but the following hint to
the compiler seems to have an effect, at least according to bloat-o-meter:


--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -179,7 +179,7 @@ enum pageflags {
 
 struct page;   /* forward declaration */
 
-static inline struct page *compound_head(struct page *page)
+static inline __attribute_const__ struct page *compound_head(struct page *page)
 {
        unsigned long head = READ_ONCE(page->compound_head);
 

$ scripts/bloat-o-meter vmlinux.o.orig vmlinux.o
add/remove: 3/13 grow/shrink: 65/689 up/down: 21080/-198089 (-177009)
Function                                     old     new   delta
ntfs_mft_record_alloc                      14414   16627   +2213
migrate_pages                               8891   10819   +1928
ext2_get_page.isra                          1029    2343   +1314
kfence_init                                  180    1331   +1151
page_remove_rmap                             754    1893   +1139
f2fs_fsync_node_pages                       4378    5406   +1028
deferred_split_huge_page                    1279    2286   +1007
relock_page_lruvec_irqsave                     -     975    +975
f2fs_file_write_iter                        3508    4408    +900
__pagevec_lru_add                            704    1311    +607
[...]
pagevec_move_tail_fn                        5333    3215   -2118
__activate_page                             6183    4021   -2162
__unmap_and_move                            2190       -   -2190
__page_cache_release                        4738    2547   -2191
migrate_page_states                         7088    4842   -2246
lru_deactivate_fn                           5925    3652   -2273
move_pages_to_lru                           7259    4980   -2279
check_move_unevictable_pages                7131    4594   -2537
release_pages                               6940    4386   -2554
lru_lazyfree_fn                             6798    4198   -2600
ntfs_mft_record_format                      2940       -   -2940
lru_deactivate_file_fn                      9220    5631   -3589
shrink_page_list                           20653   15749   -4904
page_memcg                                  5149     193   -4956
Total: Before=388863526, After=388686517, chg -0.05%

I don't know whether it breaks anything, though, or whether it gives
any real improvement.
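
To make the trade-off concrete, here is a minimal user-space sketch of what
the attribute promises. It is not the kernel code: struct page is reduced to
one field, READ_ONCE() is replaced by a plain load, and set_compound_head(),
page_heads_match() and self_check() are hypothetical helpers added only for
illustration. The kernel's __attribute_const__ expands to
__attribute__((__const__)). The GCC documentation warns that a function which
examines data through a pointer argument must not be declared const if that
data might change between successive calls, and that is the open question for
compound_head().

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's struct page (assumption: one field). */
struct page {
	uintptr_t compound_head;	/* bit 0 set: tail page; other bits: head */
};

/*
 * The const attribute promises that the result depends only on the argument
 * value, so the compiler may merge repeated calls with the same argument.
 * Here that promise only holds while page->compound_head does not change
 * between the calls.
 */
static inline __attribute__((const))
struct page *compound_head(struct page *page)
{
	uintptr_t head = page->compound_head;	/* plain load, not READ_ONCE() */

	if (head & 1)
		return (struct page *)(head - 1);
	return page;
}

/* Hypothetical helper: mark 'tail' as a tail page of 'head'. */
static void set_compound_head(struct page *tail, struct page *head)
{
	tail->compound_head = (uintptr_t)head + 1;
}

/* Hypothetical caller: the repeated lookups may be folded into one each. */
static int page_heads_match(struct page *a, struct page *b)
{
	return compound_head(a) == compound_head(b) &&
	       compound_head(a) == compound_head(b);
}

/* Small self-check: one head page followed by two tail pages. */
static void self_check(void)
{
	struct page pages[3] = { {0}, {0}, {0} };

	set_compound_head(&pages[1], &pages[0]);
	set_compound_head(&pages[2], &pages[0]);

	assert(compound_head(&pages[0]) == &pages[0]);
	assert(compound_head(&pages[1]) == &pages[0]);
	assert(page_heads_match(&pages[1], &pages[2]));
}
```

In this sketch nothing modifies compound_head between the lookups, so merging
them is harmless. In the kernel a page can become, or stop being, a tail page
between two calls, in which case a cached result would be stale.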

-- 
per aspera ad upstream
