Date: Wed, 27 Mar 2024 14:37:27 +0800
From: Kairui Song <ryncsn@...il.com>
To: "Huang, Ying" <ying.huang@...el.com>
Cc: linux-mm@...ck.org, Chris Li <chrisl@...nel.org>, Minchan Kim <minchan@...nel.org>, 
	Barry Song <v-songbaohua@...o.com>, Ryan Roberts <ryan.roberts@....com>, 
	Yu Zhao <yuzhao@...gle.com>, SeongJae Park <sj@...nel.org>, David Hildenbrand <david@...hat.com>, 
	Yosry Ahmed <yosryahmed@...gle.com>, Johannes Weiner <hannes@...xchg.org>, 
	Matthew Wilcox <willy@...radead.org>, Nhat Pham <nphamcs@...il.com>, 
	Chengming Zhou <zhouchengming@...edance.com>, Andrew Morton <akpm@...ux-foundation.org>, 
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 10/10] mm/swap: optimize synchronous swapin

On Wed, Mar 27, 2024 at 2:24 PM Huang, Ying <ying.huang@...el.com> wrote:
>
> Kairui Song <ryncsn@...il.com> writes:
>
> > From: Kairui Song <kasong@...cent.com>
> >
> > Interestingly the major performance overhead of synchronous is actually
> > from the workingset nodes update, that's because synchronous swap in
>
> If it's the major overhead, why not make it the first optimization?

This overhead only became obvious after the other optimizations were
done, and those apply to swapin in general, not only to synchronous
swapin. That is also the order in which I optimized things step by
step, so I kept my patch order...

It is also easier to do this after Patch 8/10, which introduces the new
interface for the swap cache.

>
> > keeps adding single folios into a xa_node, making the node no longer
> > a shadow node and have to be removed from shadow_nodes, then remove
> > the folio very shortly and making the node a shadow node again,
> > so it has to add back to the shadow_nodes.
>
> The folio is removed only if should_try_to_free_swap() returns true?
>
> > Mark synchronous swapin folio with a special bit in swap entry embedded
> > in folio->swap, as we still have some usable bits there. Skip workingset
> > node update on insertion of such folio because it will be removed very
> > quickly, and will trigger the update ensuring the workingset info is
> > eventual consensus.
>
> Is this safe?  Is it possible for the shadow node to be reclaimed after
> the folio are added into node and before being removed?

If an xa_node contains any non-shadow entry, it can't be reclaimed;
shadow_lru_isolate() checks for this and skips such nodes in case of a
race.
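
To illustrate the argument, here is a minimal userspace sketch of that
rule. It is not the kernel code: struct model_node and every model_*
helper are hypothetical stand-ins, and the skip_update flag only models
the idea of skipping the workingset node update on insertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an xa_node; only the fields needed here. */
struct model_node {
	unsigned int count;      /* all occupied slots (folios + shadows) */
	unsigned int nr_values;  /* slots holding shadow (value) entries */
	bool on_shadow_list;     /* models membership in shadow_nodes */
};

/* A node is a shadow node if it is non-empty and holds only shadows. */
static bool model_only_shadows(const struct model_node *n)
{
	return n->count && n->count == n->nr_values;
}

/* Models the list update: track the node only while it is a shadow node. */
static void model_update_node(struct model_node *n)
{
	n->on_shadow_list = model_only_shadows(n);
}

/* Insert one folio; skip_update models skipping the list update. */
static void model_add_folio(struct model_node *n, bool skip_update)
{
	n->count++;
	if (!skip_update)
		model_update_node(n);
}

/* Remove one folio; this path always performs the list update. */
static void model_del_folio(struct model_node *n)
{
	assert(n->count > n->nr_values); /* there must be a folio to remove */
	n->count--;
	model_update_node(n);
}

/*
 * Models the recheck at isolate time: a node that is still on the list
 * but holds a non-shadow entry is skipped instead of being reclaimed.
 */
static bool model_try_reclaim(const struct model_node *n)
{
	return n->on_shadow_list && model_only_shadows(n);
}
```

In this model, a node with three shadow entries stays on the list when
a folio is added with skip_update set, but model_try_reclaim() refuses
it until model_del_folio() has run and the node holds only shadow
entries again.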

>
> If so, we may consider some other methods.  Make shadow_nodes per-cpu?

That's also an alternative solution if there are other risks.
