Message-ID: <aEvJ3qWQfJwBh3oj@yjaykim-PowerEdge-T330>
Date: Fri, 13 Jun 2025 15:49:02 +0900
From: YoungJun Park <youngjun.park@....com>
To: Kairui Song <ryncsn@...il.com>
Cc: linux-mm@...ck.org, akpm@...ux-foundation.org, hannes@...xchg.org,
mhocko@...nel.org, roman.gushchin@...ux.dev, shakeel.butt@...ux.dev,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
shikemeng@...weicloud.com, nphamcs@...il.com, bhe@...hat.com,
baohua@...nel.org, chrisl@...nel.org, muchun.song@...ux.dev,
iamjoonsoo.kim@....com, taejoon.song@....com, gunho.lee@....com
Subject: Re: [RFC PATCH 2/2] mm: swap: apply per cgroup swap priority
mechansim on swap layer
On Thu, Jun 12, 2025 at 07:14:20PM +0800, Kairui Song wrote:
> On Thu, Jun 12, 2025 at 6:43 PM <youngjun.park@....com> wrote:
> >
> > From: "youngjun.park" <youngjun.park@....com>
> >
>
> Hi, Youngjun,
>
> Thanks for sharing this series.
>
> > This patch implements swap device selection and swap on/off propagation
> > when a cgroup-specific swap priority is set.
> >
> > There is one workaround to this implementation as follows.
> > Current per-cpu swap cluster enforces swap device selection based solely
> > on CPU locality, overriding the swap cgroup's configured priorities.
>
> I've been thinking about this, we can switch to a per-cgroup-per-cpu
> next cluster selector, the problem with current code is that swap
> allocator is not designed with folio / cgroup in mind at all, so it's
> really ugly to implement, which is why I have following two patches in
> the swap table series:
This seems to be the most suitable approach for upstream at the moment.
I think there are still a few things that need to be considered, though.
(Nhat pointed them out well. I've shared my thoughts in that context.)
> https://lore.kernel.org/linux-mm/20250514201729.48420-18-ryncsn@gmail.com/
> https://lore.kernel.org/linux-mm/20250514201729.48420-22-ryncsn@gmail.com/
>
> The first one makes all swap allocation starts with a folio, the
> second one makes the allocator always folio aware. So you can know
> which cgroup is doing the allocation at anytime inside the allocator
> (and it reduced the number of argument, also improving performance :)
> )
> So the allocator can just use cgroup's swap info if available, plist,
> percpu cluster, and fallback to global locality in a very natural way.
>
Wow! This is exactly what I needed.
Passing a memcg parameter around felt awkward to me.
If the memcg can be identified naturally inside the allocator, as you mentioned,
it would be good both performance-wise and design-wise.
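To make the idea concrete, here is a rough userspace sketch of how I picture
the selection order: use the cgroup's priorities when the folio's cgroup has
them, otherwise fall back to the global ones. This is not kernel code. Every
name in it (swap_dev, cgroup_swap_prio, folio_ctx, pick_swap_dev) is
hypothetical and only for illustration, and it leaves out the plist and the
per-CPU cluster handling entirely.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SWAP_DEVS 4

/* Hypothetical stand-in for a swap device. */
struct swap_dev {
	int global_prio;	/* priority given at swapon time */
	int free_slots;		/* remaining capacity */
};

/* Hypothetical per-cgroup priority table, indexed by device. */
struct cgroup_swap_prio {
	int prio[MAX_SWAP_DEVS];
};

/* Hypothetical stand-in for what the allocator learns from the folio. */
struct folio_ctx {
	const struct cgroup_swap_prio *cg;	/* NULL: no cgroup priorities set */
};

/* Cgroup priority if the folio's cgroup has a table, else the global one. */
static int effective_prio(const struct swap_dev *devs,
			  const struct folio_ctx *folio, size_t i)
{
	if (folio->cg)
		return folio->cg->prio[i];
	return devs[i].global_prio;
}

/*
 * Return the index of the device with the highest effective priority
 * that still has free slots, or -1 if every device is full.
 */
int pick_swap_dev(const struct swap_dev *devs, size_t ndev,
		  const struct folio_ctx *folio)
{
	int best = -1;
	int best_prio = 0;
	size_t i;

	assert(ndev <= MAX_SWAP_DEVS);

	for (i = 0; i < ndev; i++) {
		int p;

		if (devs[i].free_slots <= 0)
			continue;
		p = effective_prio(devs, folio, i);
		if (best < 0 || p > best_prio) {
			best = (int)i;
			best_prio = p;
		}
	}
	return best;
}
```

The point is only that the caller no longer passes a memcg argument: the
allocator derives everything from the folio it is handed.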
> > Therefore, when a swap cgroup priority is assigned, we fall back to
> > using per-CPU clusters per swap device, similar to the previous behavior.
> >
> > A proper fix for this workaround will be evaluated in the next patch.
>
> Hmm, but this is already the last patch in the series?
Ah! By "the next patch" I meant the next patch series.
I'm still evaluating this part and wasn't confident enough to include it
in the current version.
First, I wanted to get feedback on the core part I'm currently pursuing.