Message-ID: <CAGsJ_4we4ZfNqJ+v7+=0hjNKLakJ-s8qtRsGo_kp0R_th7Xvkw@mail.gmail.com>
Date: Wed, 3 Sep 2025 11:31:00 +1200
From: Barry Song <21cnbao@...il.com>
To: Chris Li <chrisl@...nel.org>
Cc: Kairui Song <kasong@...cent.com>, linux-mm@...ck.org, 
	Andrew Morton <akpm@...ux-foundation.org>, Matthew Wilcox <willy@...radead.org>, 
	Hugh Dickins <hughd@...gle.com>, Baoquan He <bhe@...hat.com>, Nhat Pham <nphamcs@...il.com>, 
	Kemeng Shi <shikemeng@...weicloud.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>, 
	Ying Huang <ying.huang@...ux.alibaba.com>, Johannes Weiner <hannes@...xchg.org>, 
	David Hildenbrand <david@...hat.com>, Yosry Ahmed <yosryahmed@...gle.com>, 
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, Zi Yan <ziy@...dia.com>, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 8/9] mm, swap: implement dynamic allocation of swap table

On Wed, Sep 3, 2025 at 1:17 AM Chris Li <chrisl@...nel.org> wrote:
>
> On Tue, Sep 2, 2025 at 4:15 AM Barry Song <21cnbao@...il.com> wrote:
> >
> > On Sat, Aug 23, 2025 at 3:21 AM Kairui Song <ryncsn@...il.com> wrote:
> > >
> > > From: Kairui Song <kasong@...cent.com>
> > >
> > > > Now the swap table is cluster based, which means a free cluster can
> > > > free its table, since no one should modify it.
> > > >
> > > > There could be speculative readers, such as swap cache lookups; protect
> > > > them by making them RCU safe. Every swap table should be filled with
> > > > null entries before it is freed, so such readers will see either a NULL
> > > > pointer or a null-filled table that is being lazily freed.
> > >
> > > On allocation, allocate the table when a cluster is used by any order.
> > >
> >
> > Might be a silly question.
> >
> > Just curious—what happens if the allocation fails? Does the swap-out
> > operation also fail? We sometimes encounter strange issues when memory is
> > very limited, especially if the reclamation path itself needs to allocate
> > memory.
> >
> > Assume a case where we want to swap out a folio using clusterN. We then
> > attempt to swap out the following folios with the same clusterN. But if
> > the allocation of the swap_table keeps failing, what will happen?
>
> I think this is the same behavior as XArray failing to allocate a node
> when there is no memory. The swap allocator will fail to isolate this
> cluster and get a NULL ci pointer as the return value. It will then try
> other cluster lists, e.g. non_full, fragment, etc.

What I’m actually concerned about is that we keep iterating on this
cluster. If we try others, that sounds good.
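
To make the fallback concrete, here is a minimal user-space sketch of the
behavior described above, not the actual swap allocator code. The names
struct cluster, isolate_cluster(), alloc_slot(), the fail_alloc test hook
and the 512-slot cluster size are all assumptions made up for illustration.
A cluster whose table cannot be allocated is skipped, and -ENOMEM is
returned only once every candidate has failed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOTS_PER_CLUSTER 512 /* assumed cluster size, for illustration */

/* Hypothetical user-space stand-in for a swap cluster. */
struct cluster {
	unsigned long *table; /* NULL until the cluster is first used */
	unsigned int used;    /* slots handed out so far */
	int fail_alloc;       /* test hook: simulate a failed table allocation */
};

/* Returns NULL when the cluster's table cannot be allocated. */
static struct cluster *isolate_cluster(struct cluster *ci)
{
	if (!ci->table) {
		if (ci->fail_alloc)
			return NULL;
		ci->table = calloc(SLOTS_PER_CLUSTER, sizeof(*ci->table));
		if (!ci->table)
			return NULL;
	}
	return ci;
}

/*
 * Try each candidate cluster once, in order. On success, store the index
 * of the cluster that was used in *which and return the slot number.
 * Return -ENOMEM only after all candidates have failed.
 */
static int alloc_slot(struct cluster *candidates[], int nr, int *which)
{
	assert(nr > 0 && which != NULL);

	for (int i = 0; i < nr; i++) {
		struct cluster *ci = isolate_cluster(candidates[i]);

		if (!ci || ci->used >= SLOTS_PER_CLUSTER)
			continue; /* move on instead of retrying this cluster */
		*which = i;
		return (int)ci->used++;
	}
	return -ENOMEM;
}
```

The point of the sketch is that a failed table allocation costs one skipped
candidate per attempt, so the loop cannot spin on the same cluster.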

> If all of them fail, folio_alloc_swap() will return -ENOMEM, which
> will propagate back to the swap-out attempt and then to the shrink
> folio list, which will put this page back on the LRU.
>
> The shrink folio list either frees enough memory (happy path) or is not
> able to free enough memory, which will cause an OOM kill.
>
> I believe XArray would previously also return -ENOMEM when inserting a
> pointer if it was not able to allocate a node to hold that pointer. It
> has the same error propagation path. We did not change that.

Yes, I agree there was an -ENOMEM before, but the difference is that
each allocation is much larger now :-)

One option is to organize every 4 or 8 swap slots into a group for
allocating or freeing the swap table. This way, we avoid the worst
case where a single unfreed slot consumes a whole swap table, and
the allocation size also becomes smaller. However, it’s unclear
whether the memory savings justify the added complexity and effort.
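
A rough user-space sketch of the grouping idea, just to show the shape of
it. Everything here is hypothetical: struct group, slot_set(), slot_clear(),
table_bytes(), the 512-slot cluster and the 8-slot group are assumptions,
and a zero entry stands for "slot unused". Each group's small table is
allocated when its first slot is used and freed when its last slot is
cleared.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOTS_PER_CLUSTER 512 /* assumed cluster size, for illustration */
#define GROUP_SLOTS 8         /* assumed group size (4 or 8 in the text) */
#define NR_GROUPS (SLOTS_PER_CLUSTER / GROUP_SLOTS)

struct group {
	unsigned long *entries; /* NULL while no slot in the group is used */
	unsigned int used;      /* number of non-zero entries */
};

struct cluster {
	struct group groups[NR_GROUPS];
};

/* Store a non-zero value, allocating the group's table on first use. */
static int slot_set(struct cluster *ci, unsigned int slot, unsigned long val)
{
	struct group *g;

	if (slot >= SLOTS_PER_CLUSTER || val == 0)
		return -EINVAL;
	g = &ci->groups[slot / GROUP_SLOTS];
	if (!g->entries) {
		g->entries = calloc(GROUP_SLOTS, sizeof(*g->entries));
		if (!g->entries)
			return -ENOMEM;
	}
	if (g->entries[slot % GROUP_SLOTS] == 0)
		g->used++;
	g->entries[slot % GROUP_SLOTS] = val;
	return 0;
}

/* Clear a slot and free the group's table once its last slot is gone. */
static void slot_clear(struct cluster *ci, unsigned int slot)
{
	struct group *g;

	if (slot >= SLOTS_PER_CLUSTER)
		return;
	g = &ci->groups[slot / GROUP_SLOTS];
	if (!g->entries || g->entries[slot % GROUP_SLOTS] == 0)
		return;
	g->entries[slot % GROUP_SLOTS] = 0;
	if (--g->used == 0) {
		free(g->entries);
		g->entries = NULL;
	}
}

/* Bytes of table memory currently pinned by this cluster. */
static size_t table_bytes(const struct cluster *ci)
{
	size_t bytes = 0;

	for (size_t i = 0; i < NR_GROUPS; i++)
		if (ci->groups[i].entries)
			bytes += GROUP_SLOTS * sizeof(unsigned long);
	return bytes;
}
```

With one slot left in use, this sketch pins a single group's table rather
than the table for the whole cluster. The catch is visible too: the
per-group pointer and counter take memory of their own and add a level of
indirection to every lookup, which is part of the complexity mentioned
above.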

Anyway, I’m glad to see the current swap_table moving towards merge
and look forward to running it on various devices. This should help
us see if it causes any real issues.

Thanks
Barry
