lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMgjq7CJO-GdmZGN7_xG6gtneAv3Wv8qv6FjE9udh18_qmCgRA@mail.gmail.com>
Date: Mon, 28 Apr 2025 23:31:59 +0800
From: Kairui Song <ryncsn@...il.com>
To: Heiko Carstens <hca@...ux.ibm.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>, 
	Chris Li <chrisl@...nel.org>, Barry Song <v-songbaohua@...o.com>, 
	Hugh Dickins <hughd@...gle.com>, Yosry Ahmed <yosryahmed@...gle.com>, 
	"Huang, Ying" <ying.huang@...ux.alibaba.com>, Baoquan He <bhe@...hat.com>, 
	Nhat Pham <nphamcs@...il.com>, Johannes Weiner <hannes@...xchg.org>, 
	Baolin Wang <baolin.wang@...ux.alibaba.com>, Kalesh Singh <kaleshsingh@...gle.com>, 
	Matthew Wilcox <willy@...radead.org>, linux-kernel@...r.kernel.org, 
	linux-s390@...r.kernel.org
Subject: Re: [PATCH v3 6/7] mm, swap: remove swap slot cache

On Mon, Apr 28, 2025 at 9:53 PM Heiko Carstens <hca@...ux.ibm.com> wrote:
>
> Hi Kairui,
>
> On Fri, Mar 14, 2025 at 12:59:34AM +0800, Kairui Song wrote:
> > From: Kairui Song <kasong@...cent.com>
> >
> > Slot cache is no longer needed now, removing it and all related code.
> ...
> > Signed-off-by: Kairui Song <kasong@...cent.com>
> > Reviewed-by: Baoquan He <bhe@...hat.com>
> > ---
> >  include/linux/swap.h       |   3 -
> >  include/linux/swap_slots.h |  28 ----
> >  mm/Makefile                |   2 +-
> >  mm/swap_slots.c            | 295 -------------------------------------
> >  mm/swap_state.c            |   8 +-
> >  mm/swapfile.c              | 194 ++++++++----------------
> >  6 files changed, 67 insertions(+), 463 deletions(-)
> >  delete mode 100644 include/linux/swap_slots.h
> >  delete mode 100644 mm/swap_slots.c
> ...
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> ...
> > +swp_entry_t folio_alloc_swap(struct folio *folio)
> >  {
> > -     int order = swap_entry_order(entry_order);
> > -     unsigned long size = 1 << order;
> > +     unsigned int order = folio_order(folio);
> > +     unsigned int size = 1 << order;
> >       struct swap_info_struct *si, *next;
> > -     int n_ret = 0;
> > +     swp_entry_t entry = {};
> > +     unsigned long offset;
> >       int node;
> >
> > +     if (order) {
> > +             /*
> > +              * Should not even be attempting large allocations when huge
> > +              * page swap is disabled. Warn and fail the allocation.
> > +              */
> > +             if (!IS_ENABLED(CONFIG_THP_SWAP) || size > SWAPFILE_CLUSTER) {
> > +                     VM_WARN_ON_ONCE(1);
> > +                     return entry;
> > +             }
> > +     }
>
> This warning triggers on s390. CONFIG_THP_SWAP is disabled and order
> is 8 when this triggers (reproduced with ltp's swapon01 test case):

Hi Heiko,

Thanks for the report.

>
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 895 at mm/swapfile.c:1227 folio_alloc_swap+0x438/0x440
> Modules linked in:
> CPU: 1 UID: 0 PID: 895 Comm: swapon01 Not tainted 6.14.0-rc6-00227-g0ff67f990bd4-dirty #25
> Hardware name: IBM 3931 A01 704 (z/VM 7.4.0)
> Krnl PSW : 0704d00180000000 000003ffe051210c (folio_alloc_swap+0x43c/0x440)
>            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000080000000 0000000000000001 0000000000000013 0000000000070000
>            0000000000000006 fffffef40e9da000 0000000000000000 0000037202fc4000
>            0000037f00000100 0000000000000100 0000037fe2e4b770 0000037202fc4000
>            0000000000000000 0000000000000000 000003ffe0512108 0000037fe2e4b3c8
> Krnl Code: 000003ffe05120fe: b9160044            llgfr   %r4,%r4
>            000003ffe0512102: c0e5ffdf8c0b        brasl   %r14,000003ffe0103918
>           #000003ffe0512108: af000000            mc      0,0
>           >000003ffe051210c: a7f4fe94            brc     15,000003ffe0511e34
>            000003ffe0512110: c0040069ce74        brcl    0,000003ffe124bdf8
>            000003ffe0512116: eb8ff0580024        stmg    %r8,%r15,88(%r15)
>            000003ffe051211c: b90400ef            lgr     %r14,%r15
>            000003ffe0512120: e3f0ffb8ff71        lay     %r15,-72(%r15)
> Call Trace:
>  [<000003ffe051210c>] folio_alloc_swap+0x43c/0x440
>  [<000003ffe050afa6>] add_to_swap+0x56/0xf0
>  [<000003ffe045fdc0>] shrink_folio_list+0xe80/0x13b0
>  [<000003ffe0461946>] shrink_inactive_list+0x1a6/0x550
>  [<000003ffe04624a2>] shrink_lruvec+0x2b2/0x410
>  [<000003ffe0462840>] shrink_node_memcgs+0x240/0x2d0
>  [<000003ffe0462986>] shrink_node+0xb6/0x3e0
>  [<000003ffe046302a>] do_try_to_free_pages+0xda/0x610
>  [<000003ffe0464d2c>] try_to_free_mem_cgroup_pages+0x14c/0x2a0
>  [<000003ffe0568270>] try_charge_memcg+0x220/0x5d0
>  [<000003ffe056867a>] charge_memcg+0x5a/0x270
>  [<000003ffe056a484>] __mem_cgroup_charge+0x44/0x80
>  [<000003ffe04acf20>] alloc_anon_folio+0x280/0x610
>  [<000003ffe04ad45a>] do_anonymous_page+0x1aa/0x5e0
>  [<000003ffe04af4c4>] __handle_mm_fault+0x244/0x500
>  [<000003ffe04af820>] handle_mm_fault+0xa0/0x170
>  [<000003ffe01533f8>] do_exception+0x1d8/0x4a0
>  [<000003ffe11fb92a>] __do_pgm_check+0x13a/0x220
>  [<000003ffe120c3ce>] pgm_check_handler+0x11e/0x170
> ---[ end trace 0000000000000000 ]---
>

The !CONFIG_THP_SWAP check existed before because slot cache should
reject high order allocation. But slot cache is gone, so large
allocation will directly go to the allocator.

It was not a meaningful WARN in the first place, and now the allocator
should just fail silently for high order allocation, that's totally
fine and expected and will just inform the caller to split the folio.

I'll just change the WARN_ON condition to `if (order && size >
SWAPFILE_CLUSTER)` then, this should silence the WARN.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ