Message-ID: <9e764222-a274-0a99-5e41-7cfa9ea15b86@redhat.com>
Date:   Thu, 17 Dec 2020 13:47:57 +0100
From:   David Hildenbrand <david@...hat.com>
To:     "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/25] Page folios

On 16.12.20 19:23, Matthew Wilcox (Oracle) wrote:
> One of the great things about compound pages is that when you try to
> do various operations on a tail page, it redirects to the head page and
> everything Just Works.  One of the awful things is how much we pay for
> that simplicity.  Here's an example, end_page_writeback():
> 
>         if (PageReclaim(page)) {
>                 ClearPageReclaim(page);
>                 rotate_reclaimable_page(page);
>         }
>         get_page(page);
>         if (!test_clear_page_writeback(page))
>                 BUG();
> 
>         smp_mb__after_atomic();
>         wake_up_page(page, PG_writeback);
>         put_page(page);
> 
> That all looks very straightforward, but if you dive into the disassembly,
> you see that there are four calls to compound_head() in this function
> (PageReclaim(), ClearPageReclaim(), get_page() and put_page()).  It's
> all for nothing, because if anyone does call this routine with a tail
> page, wake_up_page() will VM_BUG_ON_PGFLAGS(PageTail(page), page).
> 
> I'm not really a CPU person, but I imagine there's some kind of dependency
> here that sucks too:
> 
>     1fd7:       48 8b 57 08             mov    0x8(%rdi),%rdx
>     1fdb:       48 8d 42 ff             lea    -0x1(%rdx),%rax
>     1fdf:       83 e2 01                and    $0x1,%edx
>     1fe2:       48 0f 44 c7             cmove  %rdi,%rax
>     1fe6:       f0 80 60 02 fb          lock andb $0xfb,0x2(%rax)
> 
> Sure, it's going to be cache hot, but that cmove has to execute before
> the lock andb.
> 
> I would like to introduce a new concept that I call a Page Folio.
> Or just struct folio to its friends.  Here it is,
> struct folio {
>         struct page page;
> };
> 
> A folio is a struct page which is guaranteed not to be a tail page.
> So it's either a head page or a base (order-0) page.  That means
> we don't have to call compound_head() on it and we save massively.
> end_page_writeback() reduces from four calls to compound_head() to just
> one (at the beginning of the function) and it shrinks from 213 bytes
> to 126 bytes (using distro kernel config options).  I think even that one
> can be eliminated, but I'm going slowly at this point and taking the
> safe route of transforming a random struct page pointer into a struct
> folio pointer by calling page_folio().  By the end of this exercise,
> end_page_writeback() will become end_folio_writeback().
> 
> This is going to be a ton of work, and massively disruptive.  It'll touch
> every filesystem, and a good few device drivers!  But I think it's worth
> it.  Not every routine benefits as much as end_page_writeback(), but it
> makes everything a little better.  At 29 bytes per call to lock_page(),
> unlock_page(), put_page() and get_page(), that's on the order of 60kB of
> text for allyesconfig.  More when you add on all the PageFoo() calls.
> With the small amount of work I've done here, mm/filemap.o shrinks its
> text segment by over a kilobyte from 33687 to 32318 bytes (and also 192
> bytes of data).

Just wondering, as the primary motivation here is "minimizing CPU work",
did you run any benchmarks that revealed a visible performance improvement?

Otherwise, we're left with a concept that's hard to grasp at first (folio -
what?!) and "a ton of work, and massively disruptive", saving some kB of
code - which does not sound too appealing to me.

(I like the idea of abstracting which pages are actually worth looking
at directly instead of going via a tail page - tail pages act somewhat
like a proxy for the head page when accessing flags)
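
To make that proxy relationship concrete, here is a rough userspace sketch.
It is not the kernel's code: struct page is reduced to two fields, the flag
mask PG_RECLAIM_MASK and the helpers make_compound(), page_test_reclaim() and
folio_test_reclaim() are hypothetical names made up for illustration. Only
struct folio, compound_head() and page_folio() are names taken from the
cover letter, and compound_head() is modelled on the quoted disassembly
(bit 0 of the word at offset 8 marks a tail page).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace model; NOT the kernel's real struct page. */
struct page {
	unsigned long flags;
	uintptr_t compound_head;	/* bit 0 set: tail page, rest is head address */
};

/* As in the cover letter: a page guaranteed not to be a tail page. */
struct folio {
	struct page page;
};

#define PG_RECLAIM_MASK	(1UL << 2)	/* hypothetical bit position */

/* Mirrors the mov/lea/and/cmove sequence from the quoted disassembly. */
static struct page *compound_head(struct page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct page *)(head - 1);
	return page;
}

/* Resolve the head once; the result needs no further tail checks. */
static struct folio *page_folio(struct page *page)
{
	return (struct folio *)compound_head(page);
}

/* Hypothetical helper: link pages[1..nr-1] as tails of pages[0]. */
static void make_compound(struct page *pages, size_t nr)
{
	for (size_t i = 1; i < nr; i++)
		pages[i].compound_head = (uintptr_t)&pages[0] | 1;
}

/* Page-based flag test: pays for the tail check on every call. */
static int page_test_reclaim(struct page *page)
{
	return (compound_head(page)->flags & PG_RECLAIM_MASK) != 0;
}

/* Folio-based flag test: direct access, no tail check. */
static int folio_test_reclaim(struct folio *folio)
{
	return (folio->page.flags & PG_RECLAIM_MASK) != 0;
}
```

In this model a caller that already holds a struct folio pointer reads the
flags directly, while every page-based accessor repeats the tail check.
Whether removing those repeated checks shows up in a benchmark is exactly
the open question above.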

-- 
Thanks,

David / dhildenb
