Message-ID: <YSVMAS2pQVq+xma7@casper.infradead.org>
Date:   Tue, 24 Aug 2021 20:44:01 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] Memory folios for v5.15

On Tue, Aug 24, 2021 at 02:32:56PM -0400, Johannes Weiner wrote:
> The folio doc says "It is at least as large as %PAGE_SIZE";
> folio_order() says "A folio is composed of 2^order pages";
> page_folio(), folio_pfn(), folio_nr_pages all encode an N:1
> relationship. And yes, the name implies it too.
> 
> This is in direct conflict with what I'm talking about, where base
> page granularity could become coarser than file cache granularity.

That doesn't make any sense.  A page is the fundamental unit of the
mm.  Why would we want to increase the granularity of page allocation
and not increase the granularity of the file cache?
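
To spell out what those helpers encode, here is a rough illustrative
sketch (the checking function is made up, not in-tree code; the helpers
are the ones quoted above, plus folio_page() and page_to_pfn()): a folio
of order N covers 2^N contiguous pages, and every one of those pages
points back at the same folio.

static void check_folio_invariants(struct folio *folio)
{
	long i, n = folio_nr_pages(folio);	/* == 1L << folio_order(folio) */

	for (i = 0; i < n; i++) {
		struct page *page = folio_page(folio, i);

		/* N pages : 1 folio */
		BUG_ON(page_folio(page) != folio);
		BUG_ON(page_to_pfn(page) != folio_pfn(folio) + i);
	}
}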

> Are we going to bump struct page to 2M soon? I don't know. Here is
> what I do know about 4k pages, though:
> 
> - It's a lot of transactional overhead to manage tens of gigs of
>   memory in 4k pages. We're reclaiming, paging and swapping more than
>   ever before in our DCs, because flash provides in abundance the
>   low-latency IOPS required for that, and parking cold/warm workload
>   memory on cheap flash saves expensive RAM. But we're continuously
>   scanning thousands of pages per second to do this. There was also
>   the RWF_UNCACHED thread around reclaim CPU overhead at the higher
>   end of buffered IO rates. There is the fact that we have a pending
>   proposal from Google to replace rmap because it's too CPU-intensive
>   when paging into compressed memory pools.

This seems like an argument for folios, not against them.  If user
memory (both anon and file) is being allocated in larger chunks, there
are fewer pages to scan, less book-keeping to do, and all you're paying
for that is I/O bandwidth.
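
Back-of-envelope, with made-up but plausible numbers: a 64GB working
set in 4kB pages is ~16 million LRU entries to scan, age and rmap-walk;
the same memory in 64kB folios is ~1 million, and in 2MB folios ~32
thousand.  The per-entry reclaim and book-keeping cost shrinks by the
same factor, and what you pay is that each read/write unit gets
proportionally larger.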

> - It's a lot of internal fragmentation. Compaction is becoming the
>   default method for allocating the majority of memory in our
>   servers. This is a latency concern during page faults, and a
>   predictability concern when we defer it to khugepaged collapsing.

Again, the more memory that we allocate in higher-order chunks, the
better this situation becomes.

> - struct page is statically eating gigs of expensive memory on every
>   single machine, when only some of our workloads would require this
>   level of granularity for some of their memory. And that's *after*
>   we're fighting over every bit in that structure.
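
(For scale: struct page is 64 bytes on 64-bit, so the memmap costs
64 bytes per 4kB page, about 1.6% of RAM, or roughly 16GB on a 1TB
machine, regardless of how that memory ends up being used.)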

That, folios do not help with.  I have post-folio ideas about how
to address that, but I can't realistically start working on them
until folios are upstream.
