linux-kernel - Re: [PATCH v2] mm: swap: async free swap slot cache entries

Message-ID: <CAF8kJuNvB8gXv3kj2nkN5j2ny0ZjJoVEdkeDDWSuWxySkKE=1g@mail.gmail.com>
Date: Sat, 3 Feb 2024 10:12:08 -0800
From: Chris Li <chrisl@...nel.org>
To: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: "Huang, Ying" <ying.huang@...el.com>, Andrew Morton <akpm@...ux-foundation.org>, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org, 
	Wei Xu <weixugc@...gle.com>, 
	Yu Zhao <yuzhao@...gle.com>, 
	Greg Thelen <gthelen@...gle.com>, Chun-Tse Shao <ctshao@...gle.com>, 
	Suren Baghdasaryan <surenb@...gle.com>, 
	Yosry Ahmed <yosryahmed@...gle.com>, 
	Brain Geffon <bgeffon@...gle.com>, Minchan Kim <minchan@...nel.org>, Michal Hocko <mhocko@...e.com>, 
	Mel Gorman <mgorman@...hsingularity.net>, Nhat Pham <nphamcs@...il.com>, 
	Johannes Weiner <hannes@...xchg.org>, Kairui Song <kasong@...cent.com>, 
	Zhongkun He <hezhongkun.hzk@...edance.com>, Kemeng Shi <shikemeng@...weicloud.com>, 
	Barry Song <v-songbaohua@...o.com>
Subject: Re: [PATCH v2] mm: swap: async free swap slot cache entries

On Thu, Feb 1, 2024 at 3:21 PM Tim Chen <tim.c.chen@...ux.intel.com> wrote:
>
> On Thu, 2024-02-01 at 13:33 +0800, Huang, Ying wrote:
> > Chris Li <chrisl@...nel.org> writes:
> >
> > >
> > > Changes in v2:
> > > > - Add description of the impact of time changing suggested by Ying.
> > > - Remove create_workqueue() and use schedule_work()
> > > - Link to v1: https://lore.kernel.org/r/20231221-async-free-v1-1-94b277992cb0@kernel.org
> > > ---
> > >  include/linux/swap_slots.h |  1 +
> > >  mm/swap_slots.c            | 29 +++++++++++++++++++++--------
> > >  2 files changed, 22 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/include/linux/swap_slots.h b/include/linux/swap_slots.h
> > > index 15adfb8c813a..67bc8fa30d63 100644
> > > --- a/include/linux/swap_slots.h
> > > +++ b/include/linux/swap_slots.h
> > > @@ -19,6 +19,7 @@ struct swap_slots_cache {
> > >     spinlock_t      free_lock;  /* protects slots_ret, n_ret */
> > >     swp_entry_t     *slots_ret;
> > >     int             n_ret;
> > > +   struct work_struct async_free;
> > >  };
> > >
> > >  void disable_swap_slots_cache_lock(void);
> > > diff --git a/mm/swap_slots.c b/mm/swap_slots.c
> > > index 0bec1f705f8e..71d344564e55 100644
> > > --- a/mm/swap_slots.c
> > > +++ b/mm/swap_slots.c
> > > @@ -44,6 +44,7 @@ static DEFINE_MUTEX(swap_slots_cache_mutex);
> > >  static DEFINE_MUTEX(swap_slots_cache_enable_mutex);
> > >
> > >  static void __drain_swap_slots_cache(unsigned int type);
> > > +static void swapcache_async_free_entries(struct work_struct *data);
> > >
> > >  #define use_swap_slot_cache (swap_slot_cache_active && swap_slot_cache_enabled)
> > >  #define SLOTS_CACHE 0x1
> > > @@ -149,6 +150,7 @@ static int alloc_swap_slot_cache(unsigned int cpu)
> > >             spin_lock_init(&cache->free_lock);
> > >             cache->lock_initialized = true;
> > >     }
> > > +   INIT_WORK(&cache->async_free, swapcache_async_free_entries);
> > >     cache->nr = 0;
> > >     cache->cur = 0;
> > >     cache->n_ret = 0;
> > > @@ -269,6 +271,20 @@ static int refill_swap_slots_cache(struct swap_slots_cache *cache)
> > >     return cache->nr;
> > >  }
> > >
> > > +static void swapcache_async_free_entries(struct work_struct *data)
> > > +{
> > > +   struct swap_slots_cache *cache;
> > > +
> > > +   cache = container_of(data, struct swap_slots_cache, async_free);
> > > +   spin_lock_irq(&cache->free_lock);
> > > +   /* Swap slots cache may be deactivated before acquiring lock */
> > > +   if (cache->slots_ret) {
> > > +           swapcache_free_entries(cache->slots_ret, cache->n_ret);
> > > +           cache->n_ret = 0;
> > > +   }
> > > +   spin_unlock_irq(&cache->free_lock);
> > > +}
> > > +
> > >  void free_swap_slot(swp_entry_t entry)
> > >  {
> > >     struct swap_slots_cache *cache;
> > > @@ -282,17 +298,14 @@ void free_swap_slot(swp_entry_t entry)
> > >                     goto direct_free;
> > >             }
> > >             if (cache->n_ret >= SWAP_SLOTS_CACHE_SIZE) {
> > > -                   /*
> > > -                    * Return slots to global pool.
> > > -                    * The current swap_map value is SWAP_HAS_CACHE.
> > > -                    * Set it to 0 to indicate it is available for
> > > -                    * allocation in global pool
> > > -                    */
> > > -                   swapcache_free_entries(cache->slots_ret, cache->n_ret);
> > > -                   cache->n_ret = 0;
> > > +                   spin_unlock_irq(&cache->free_lock);
> > > +                   schedule_work(&cache->async_free);
> > > +                   goto direct_free;
> > >             }
> > >             cache->slots_ret[cache->n_ret++] = entry;
> > >             spin_unlock_irq(&cache->free_lock);
> > > +           if (cache->n_ret >= SWAP_SLOTS_CACHE_SIZE)
> > > +                   schedule_work(&cache->async_free);
>
>
> I have some concerns about the current patch with the change above.
> We could hit the direct_free path very often.
>
> By delaying the freeing of entries in the return
> cache, we have to do more freeing of swap entry one at a time. When
> we try to free an entry, we can find the return cache still full, waiting to be freed.

You are describing the case where the async free is not working. In that
case every free will hit the direct free path, one entry at a time.

>
> So we have fewer batch free of swap entries, resulting in an increase in
> number of sis->lock acquisitions overall. This could have the
> effect of reducing swap throughput overall when swap is under heavy
> operations and sis->lock is contended.

I can change the direct free path to free all entries. If the async
free hasn't freed up the batch by the time the next swap fault comes in,
the new swap fault will take the hit and just free the whole batch. That
will behave closer to the original batch free behavior in this path.
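
To make the difference concrete, here is a minimal userspace sketch. It is
not kernel code and not the actual follow-up patch. Every model_* name and
CACHE_SIZE are hypothetical stand-ins, and it assumes that one batch free
costs a single sis->lock acquisition and that the async worker never gets to
run (the worst case described above).

```c
#include <assert.h>
#include <stdbool.h>

#define CACHE_SIZE 64	/* stand-in for SWAP_SLOTS_CACHE_SIZE (assumed value) */

/* Hypothetical userspace model of the per-CPU return cache. */
struct model_cache {
	unsigned long slots_ret[CACHE_SIZE];
	int n_ret;
	bool work_pending;		/* models a queued async_free work item */
	unsigned long lock_acquisitions;/* models sis->lock round trips */
	unsigned long entries_freed;
};

/* Models swapcache_free_entries(): assume one lock acquisition per batch. */
static void model_free_batch(struct model_cache *c,
			     const unsigned long *entries, int n)
{
	(void)entries;
	if (n <= 0)
		return;
	c->lock_acquisitions++;
	c->entries_freed += (unsigned long)n;
}

/* v2 as posted: a full cache schedules work, then frees only the new entry. */
static void model_free_slot_v2(struct model_cache *c, unsigned long entry)
{
	if (c->n_ret >= CACHE_SIZE) {
		c->work_pending = true;		/* schedule_work() */
		model_free_batch(c, &entry, 1);	/* direct_free, one entry */
		return;
	}
	c->slots_ret[c->n_ret++] = entry;
	if (c->n_ret >= CACHE_SIZE)
		c->work_pending = true;
}

/* Proposed: a caller that finds the cache still full frees the whole batch. */
static void model_free_slot_proposed(struct model_cache *c, unsigned long entry)
{
	if (c->n_ret >= CACHE_SIZE) {
		model_free_batch(c, c->slots_ret, c->n_ret);
		c->n_ret = 0;
		c->work_pending = false;	/* nothing left for the worker */
	}
	assert(c->n_ret < CACHE_SIZE);
	c->slots_ret[c->n_ret++] = entry;
	if (c->n_ret >= CACHE_SIZE)
		c->work_pending = true;
}

typedef void (*model_free_fn)(struct model_cache *, unsigned long);

/* Free n_entries slots with the worker never running; return lock count. */
static unsigned long model_run(model_free_fn fn, unsigned long n_entries)
{
	struct model_cache c = {0};
	unsigned long e;

	for (e = 0; e < n_entries; e++)
		fn(&c, e);
	return c.lock_acquisitions;
}
```

Calling model_run() with each of the two free functions and the same entry
count compares the lock acquisitions under both policies. In this model the
posted v2 path takes the lock once per entry after the cache fills, while the
proposed path takes it once per full batch.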

Chris