Message-ID: <CAF8kJuNbUyDWcJ13ZLi-xsiYcbY30w7=cFs7wdxszkc7TC4K2Q@mail.gmail.com>
Date: Tue, 16 Sep 2025 15:42:07 -0700
From: Chris Li <chrisl@...nel.org>
To: Barry Song <21cnbao@...il.com>
Cc: Kairui Song <ryncsn@...il.com>, linux-mm@...ck.org, 
	Andrew Morton <akpm@...ux-foundation.org>, Matthew Wilcox <willy@...radead.org>, 
	Hugh Dickins <hughd@...gle.com>, Baoquan He <bhe@...hat.com>, Nhat Pham <nphamcs@...il.com>, 
	Kemeng Shi <shikemeng@...weicloud.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>, 
	Ying Huang <ying.huang@...ux.alibaba.com>, Johannes Weiner <hannes@...xchg.org>, 
	David Hildenbrand <david@...hat.com>, Yosry Ahmed <yosryahmed@...gle.com>, 
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Zi Yan <ziy@...dia.com>, 
	linux-kernel@...r.kernel.org, Kairui Song <kasong@...cent.com>
Subject: Re: [PATCH v4 01/15] docs/mm: add document for swap table

On Tue, Sep 16, 2025 at 3:00 PM Barry Song <21cnbao@...il.com> wrote:
>
> On Wed, Sep 17, 2025 at 12:01 AM Kairui Song <ryncsn@...il.com> wrote:
> >
> > From: Chris Li <chrisl@...nel.org>
> >
> > Swap table is the new swap cache.
> >
> > Signed-off-by: Chris Li <chrisl@...nel.org>
> > Signed-off-by: Kairui Song <kasong@...cent.com>
> > ---
> >  Documentation/mm/index.rst      |  1 +
> >  Documentation/mm/swap-table.rst | 72 +++++++++++++++++++++++++++++++++
> >  MAINTAINERS                     |  1 +
> >  3 files changed, 74 insertions(+)
> >  create mode 100644 Documentation/mm/swap-table.rst
> >
> > diff --git a/Documentation/mm/index.rst b/Documentation/mm/index.rst
> > index fb45acba16ac..828ad9b019b3 100644
> > --- a/Documentation/mm/index.rst
> > +++ b/Documentation/mm/index.rst
> > @@ -57,6 +57,7 @@ documentation, or deleted if it has served its purpose.
> >     page_table_check
> >     remap_file_pages
> >     split_page_table_lock
> > +   swap-table
> >     transhuge
> >     unevictable-lru
> >     vmalloced-kernel-stacks
> > diff --git a/Documentation/mm/swap-table.rst b/Documentation/mm/swap-table.rst
> > new file mode 100644
> > index 000000000000..acae6ceb4f7b
> > --- /dev/null
> > +++ b/Documentation/mm/swap-table.rst
> > @@ -0,0 +1,72 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +:Author: Chris Li <chrisl@...nel.org>, Kairui Song <kasong@...cent.com>
> > +
> > +==========
> > +Swap Table
> > +==========
> > +
> > +Swap table implements swap cache as a per-cluster swap cache value array.
> > +
> > +Swap Entry
> > +----------
> > +
> > +A swap entry contains the information required to serve the anonymous page
> > +fault.
> > +
> > +Swap entry is encoded as two parts: swap type and swap offset.
> > +
> > +The swap type indicates which swap device to use.
> > +The swap offset is the offset of the swap file to read the page data from.
> > +
> > +Swap Cache
> > +----------
> > +
> > +Swap cache is a map to look up folios using swap entry as the key. The result
> > +value can have three possible types depending on which stage of this swap entry
> > +was in.
> > +
> > +1. NULL: This swap entry is not used.
> > +
> > +2. folio: A folio has been allocated and bound to this swap entry. This is
> > +   the transient state of swap out or swap in. The folio data can be in
> > +   the folio or swap file, or both.
>
> This doesn’t look quite right.
>
> the folio’s data must reside within the folio itself?

For the swap-out case that is true. In the swap-in case, you allocate
the folio first and then read the data from the swap file into the
folio. There is a window in which the swap file has the data and the
folio does not.

> The data might also be in a swap file, or not.

Data that is only in the swap file is covered by "data can be in the
folio or swap file"; it is an OR relationship.

I think my previous statement is still correct when you consider both
swap out and swap in. Of course there is always room to make it
clearer. But "the folio always has the data" is not true for swap in.
If you have other ways to improve it, please feel free to suggest
them.
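
A minimal sketch of the two cases, for illustration only. It is a
simplified user-space model, not kernel code; every name in it
(sketch_dir, sketch_loc, sketch_data_location) is hypothetical, and
"io_done" simply stands for the swap write or read having completed.

```c
#include <stdbool.h>

/* Hypothetical model: direction of the transient swap cache state. */
enum sketch_dir { SKETCH_SWAP_OUT, SKETCH_SWAP_IN };

/* Where a valid copy of the page data currently lives. */
struct sketch_loc {
    bool in_folio;
    bool in_swap_file;
};

/*
 * io_done: the write to the swap file (swap out) or the read from
 * the swap file (swap in) has finished.
 */
static struct sketch_loc sketch_data_location(enum sketch_dir dir,
                                              bool io_done)
{
    struct sketch_loc loc;

    if (dir == SKETCH_SWAP_OUT) {
        /* Swap out: the folio holds the data the whole time. */
        loc.in_folio = true;
        loc.in_swap_file = io_done;
    } else {
        /* Swap in: the folio is allocated first, then filled. */
        loc.in_swap_file = true;
        loc.in_folio = io_done;
    }
    return loc;
}
```

With SKETCH_SWAP_IN and io_done false, the model reports the data in
the swap file only, which is the window described above. Once the I/O
finishes in either direction, both copies exist.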


> On a 32-bit system, I’m guessing the swap table is 2 KB, which is about
> half of a page?

Yes, true. I considered that but decided to leave it out of the
document. There are a lot of other implementation details the document
does not cover, not just this one. This document provides a simple
abstracted view (it might not cover every detailed case). One way to
address that is to add the qualification "on a 64-bit system". What do
you say? I don't want to talk about 32-bit systems having half a page
in this document; I consider that too much detail. 32-bit systems are
pretty rare nowadays.
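
For reference, the size arithmetic behind the 2 KB figure can be
sketched as below. This is only an illustration: the 512 slots per
cluster and the 4 KB page size are assumptions chosen to match the
numbers in this thread, and the names are hypothetical, not kernel
identifiers.

```c
#include <stddef.h>

/* Assumption: 512 swap slots per cluster. */
#define SKETCH_CLUSTER_SLOTS 512u
/* Assumption: 4 KB pages. */
#define SKETCH_PAGE_BYTES 4096u

/*
 * Bytes needed for one per-cluster table that stores one
 * pointer-sized entry per swap slot.
 */
static size_t sketch_table_bytes(size_t slots, size_t entry_bytes)
{
    return slots * entry_bytes;
}
```

Under those assumptions, sketch_table_bytes(SKETCH_CLUSTER_SLOTS, 8)
gives 4096 bytes, one full page on a 64-bit system, while
sketch_table_bytes(SKETCH_CLUSTER_SLOTS, 4) gives 2048 bytes, half a
page on a 32-bit system.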

Chris
