lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zd82FqN7qxuBUSvl@casper.infradead.org>
Date: Wed, 28 Feb 2024 13:33:10 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Ryan Roberts <ryan.roberts@....com>
Cc: David Hildenbrand <david@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Huang Ying <ying.huang@...el.com>, Gao Xiang <xiang@...nel.org>,
	Yu Zhao <yuzhao@...gle.com>, Yang Shi <shy828301@...il.com>,
	Michal Hocko <mhocko@...e.com>,
	Kefeng Wang <wangkefeng.wang@...wei.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 1/4] mm: swap: Remove CLUSTER_FLAG_HUGE from
 swap_cluster_info:flags

On Wed, Feb 28, 2024 at 09:37:06AM +0000, Ryan Roberts wrote:
> Fundamentally, we would like to be able to figure out the size of the swap slot
> from the swap entry. Today swap supports 2 sizes; PAGE_SIZE and PMD_SIZE. For
> PMD_SIZE, it always uses a full cluster, so can easily add a flag to the cluster
> to mark it as PMD_SIZE.
> 
> Going forwards, we want to support all sizes (power-of-2). Most of the time, a
> cluster will contain only one size of THPs, but this is not the case when a THP
> in the swapcache gets split or when an order-0 slot gets stolen. We expect these
> cases to be rare.
> 
> 1) Keep the size of the smallest swap entry in the cluster header. Most of the
> time it will be the full size of the swap entry, but sometimes it will cover
> only a portion. In the latter case you may see a false negative for
> swap_page_trans_huge_swapped() meaning we take the slow path, but that is rare.
> There is one wrinkle: currently the HUGE flag is cleared in put_swap_folio(). We
> wouldn't want to do the equivalent in the new scheme (i.e. set the whole cluster
> to order-0). I think that is safe, but haven't completely convinced myself yet.
> 
> 2) allocate 4 bits per (small) swap slot to hold the order. This will give
> precise information and is conceptually simpler to understand, but will cost
> more memory (half as much as the initial swap_map[] again).
> 
> I still prefer to avoid this at all if we can (and would like to hear Huang's
> thoughts). But if its a choice between 1 and 2, I prefer 1 - I'll do some
> prototyping.

I can't quite bring myself to look up the encoding of swap entries
but as long as we're willing to restrict ourselves to naturally aligning
the clusters, there's an encoding (which I believe I invented) that lets
us encode arbitrary power-of-two sizes with a single bit.

I describe it here:
https://kernelnewbies.org/MatthewWilcox/NaturallyAlignedOrder

Let me know if it's not clear.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ