lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKEwX=PiOdrR7Ad5XoT8pRZDLB=q6B_fmwQ3ScgWFPNptBuHPw@mail.gmail.com>
Date: Tue, 24 Sep 2024 07:32:36 -0700
From: Nhat Pham <nphamcs@...il.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc: Yosry Ahmed <yosryahmed@...gle.com>, akpm@...ux-foundation.org, hannes@...xchg.org, 
	hughd@...gle.com, shakeel.butt@...ux.dev, ryan.roberts@....com, 
	ying.huang@...el.com, chrisl@...nel.org, david@...hat.com, kasong@...cent.com, 
	willy@...radead.org, viro@...iv.linux.org.uk, baohua@...nel.org, 
	chengming.zhou@...ux.dev, linux-mm@...ck.org, kernel-team@...a.com, 
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/2] remove SWAP_MAP_SHMEM

On Mon, Sep 23, 2024 at 8:25 PM Baolin Wang
<baolin.wang@...ux.alibaba.com> wrote:
>
>
> On 2024/9/24 10:15, Yosry Ahmed wrote:
> > On Mon, Sep 23, 2024 at 6:55 PM Baolin Wang
> > <baolin.wang@...ux.alibaba.com> wrote:
> >>
> >>
> >>
> >> On 2024/9/24 07:11, Nhat Pham wrote:
> >>> The SWAP_MAP_SHMEM state was originally introduced in the commit
> >>> aaa468653b4a ("swap_info: note SWAP_MAP_SHMEM"), to quickly determine if a
> >>> swap entry belongs to shmem during swapoff.
> >>>
> >>> However, swapoff has since been rewritten drastically in the commit
> >>> b56a2d8af914 ("mm: rid swapoff of quadratic complexity"). Now
> >>> having swap count == SWAP_MAP_SHMEM value is basically the same as having
> >>> swap count == 1, and swap_shmem_alloc() behaves analogously to
> >>> swap_duplicate()
> >>>
> >>> This RFC proposes the removal of this state and the associated helper to
> >>> simplify the state machine (both mentally and code-wise). We will also
> >>> have an extra state/special value that can be repurposed (for swap entries
> >>> that never gets re-duplicated).
> >>>
> >>> Another motivation (albeit a bit premature at the moment) is the new swap
> >>> abstraction I am currently working on, that would allow for swap/zswap
> >>> decoupling, swapoff optimization, etc. The fewer states and swap API
> >>> functions there are, the simpler the conversion will be.
> >>>
> >>> I am sending this series first as an RFC, just in case I missed something
> >>> or misunderstood this state, or if someone has a swap optimization in mind
> >>> for shmem that would require this special state.
> >>
> >> The idea makes sense to me. I did a quick test with shmem mTHP, and
> >> encountered the following warning which is triggered by
> >> 'VM_WARN_ON(usage == 1 && nr > 1)' in __swap_duplicate().
> >
> > Apparently __swap_duplicate() does not currently handle increasing the
> > swap count for multiple swap entries by 1 (i.e. usage == 1) because it
> > does not handle rolling back count increases when
> > swap_count_continued() fails.
> >
> > I guess this voids my Reviewed-by until we sort this out. Technically
> > swap_count_continued() won't ever be called for shmem because we only
> > ever increment the count by 1, but there is no way to know this in
> > __swap_duplicate() without SWAP_HAS_SHMEM.

Ah this is my bad. I compiled with CONFIG_THP_SWAP, but forgot to
remove the swapfile check (that's another can of worms, but I need
data before submitting the patch to remove it...)

One thing we can do is instead of warning here, we can handle it in
the for loop check, where we have access to count - that's the point
of having that for-loop check anyway? :)

There's a couple of ways to go about it:

1. VM_WARN_ON(usage == 1 && nr > 1 && count != 0 );
(or more accurately, (count & ~COUNT_CONTINUED) >= SWAP_MAP_MAX))

2. Alternatively, instead of warning here, we can simply return
-ENOMEM. Then, at shmem callsite, have a VM_WARN_ON/VM_BUG_ON(), since
this MUST succeed.

Either solutions should follow with careful documentation to make it
clear the expectation/guarantee of the new API.

Yosry, Baolin, how do you two feel about this? Would something like
this work? I need to test it first, but let me know if I'm missing
something.

If this does not work, we can do what Baolin is suggesting, and
perhaps maintain the swap_shmem_alloc() helper. It's less than ideal,
but at least we still lose a state...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ