Message-ID: <CACePvbWiGZEuR1xHorjS2mXE-=Z4ZfpR8U_1jSMGMBe8PFnU_g@mail.gmail.com>
Date: Mon, 15 Sep 2025 11:03:01 -0700
From: Chris Li <chrisl@...nel.org>
To: Chris Mason <clm@...a.com>
Cc: Kairui Song <ryncsn@...il.com>, linux-mm@...ck.org,
Andrew Morton <akpm@...ux-foundation.org>, Matthew Wilcox <willy@...radead.org>,
Hugh Dickins <hughd@...gle.com>, Barry Song <baohua@...nel.org>, Baoquan He <bhe@...hat.com>,
Nhat Pham <nphamcs@...il.com>, Kemeng Shi <shikemeng@...weicloud.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>, Ying Huang <ying.huang@...ux.alibaba.com>,
Johannes Weiner <hannes@...xchg.org>, David Hildenbrand <david@...hat.com>,
Yosry Ahmed <yosryahmed@...gle.com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Zi Yan <ziy@...dia.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 14/15] mm, swap: implement dynamic allocation of swap table
On Mon, Sep 15, 2025 at 10:14 AM Chris Li <chrisl@...nel.org> wrote:
>
> On Mon, Sep 15, 2025 at 10:00 AM Chris Mason <clm@...a.com> wrote:
> >
> >
> >
> > On 9/15/25 12:24 PM, Kairui Song wrote:
> > > On Mon, Sep 15, 2025 at 11:55 PM Chris Mason <clm@...a.com> wrote:
> > >>
> > >> On Thu, 11 Sep 2025 00:08:32 +0800 Kairui Song <ryncsn@...il.com> wrote:
> >
> > [ ... ]
> > >>> spin_lock(&si->global_cluster_lock);
> > >>> + /*
> > >>> + * Back to atomic context. First, check if we migrated to a new
> > >>> + * CPU with a usable percpu cluster. If so, try using that instead.
> > >>> + * No need to check it for the spinning device, as swap is
> > >>> + * serialized by the global lock on them.
> > >>> + *
> > >>> + * The is_usable check is a bit rough, but ensures order 0 success.
> > >>> + */
> > >>> + offset = this_cpu_read(percpu_swap_cluster.offset[order]);
> > >>> + if ((si->flags & SWP_SOLIDSTATE) && offset) {
> > >>> + pcp_ci = swap_cluster_lock(si, offset);
> > >>> + if (cluster_is_usable(pcp_ci, order) &&
> > >>> + pcp_ci->count < SWAPFILE_CLUSTER) {
> > >>> + ci = pcp_ci;
> > >> ^^^^^^^^^^^^^
> > >> ci came from the caller, and in the case of isolate_lock_cluster() they
> > >> had just removed it from a list. We overwrite ci and return something
> > >> different.
> > >
> > > Yes, that's expected. See the comment above. We have just dropped the
> > > local lock, so it's possible that we migrated to another CPU which has
> > > its own percpu cached ci (percpu_swap_cluster.offset).
> > >
> > > To avoid fragmentation, drop the isolated ci and use the percpu ci
> > > instead. But you are right that I need to add the ci back to the list,
> > > or it will be leaked. Thanks!
> >
> > Yeah, the comment helped a lot (thank you). It was just the leak I was
> > worried about ;)
>
> As Kairui said, that is not a leak; it is the intended behavior. It
> rotates the list head when fetching the ci from the list, to avoid
> repeatedly retrying some fragmented cluster that has a very low
> success rate. Otherwise it can stall on the same fragmented list. It
> does look odd at first glance, but that is the best we can do so far
> without pushing a lot of repeated rotation logic into the caller. If
> you find another way to improve readability without a performance
> penalty while making the code simpler, feel free to send suggestions
> or even patches.
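>
> For illustration, the fetch side is roughly this (a simplified,
> untested sketch, not the exact kernel code; the list and lock names
> here are assumptions):
>
>	struct swap_cluster_info *ci;
>
>	spin_lock(&si->lock);
>	/*
>	 * Take the first cluster off the fragment list so a failed
>	 * scan does not keep retrying the same cluster; the caller
>	 * puts it back at the tail when done, rotating the list.
>	 */
>	ci = list_first_entry_or_null(&si->frag_clusters[order],
>				      struct swap_cluster_info, list);
>	if (ci)
>		list_del_init(&ci->list);
>	spin_unlock(&si->lock);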
Sorry, I take back what I just said. There might be a real leak, as you
point out. My bad.
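
Something along these lines is probably needed to plug it (an untested
sketch on top of the quoted hunk; relist_cluster() is a hypothetical
stand-in for whatever helper puts an isolated cluster back on the
right list):

	if (cluster_is_usable(pcp_ci, order) &&
	    pcp_ci->count < SWAPFILE_CLUSTER) {
		/*
		 * ci was isolated from its list by the caller. Put it
		 * back before overwriting it, or the cluster is lost.
		 */
		relist_cluster(si, ci);	/* hypothetical helper */
		swap_cluster_unlock(ci);
		ci = pcp_ci;
	} else {
		swap_cluster_unlock(pcp_ci);
	}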
Chris