[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMgjq7CJO-GdmZGN7_xG6gtneAv3Wv8qv6FjE9udh18_qmCgRA@mail.gmail.com>
Date: Mon, 28 Apr 2025 23:31:59 +0800
From: Kairui Song <ryncsn@...il.com>
To: Heiko Carstens <hca@...ux.ibm.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Chris Li <chrisl@...nel.org>, Barry Song <v-songbaohua@...o.com>,
Hugh Dickins <hughd@...gle.com>, Yosry Ahmed <yosryahmed@...gle.com>,
"Huang, Ying" <ying.huang@...ux.alibaba.com>, Baoquan He <bhe@...hat.com>,
Nhat Pham <nphamcs@...il.com>, Johannes Weiner <hannes@...xchg.org>,
Baolin Wang <baolin.wang@...ux.alibaba.com>, Kalesh Singh <kaleshsingh@...gle.com>,
Matthew Wilcox <willy@...radead.org>, linux-kernel@...r.kernel.org,
linux-s390@...r.kernel.org
Subject: Re: [PATCH v3 6/7] mm, swap: remove swap slot cache
On Mon, Apr 28, 2025 at 9:53 PM Heiko Carstens <hca@...ux.ibm.com> wrote:
>
> Hi Kairui,
>
> On Fri, Mar 14, 2025 at 12:59:34AM +0800, Kairui Song wrote:
> > From: Kairui Song <kasong@...cent.com>
> >
> > Slot cache is no longer needed now, removing it and all related code.
> ...
> > Signed-off-by: Kairui Song <kasong@...cent.com>
> > Reviewed-by: Baoquan He <bhe@...hat.com>
> > ---
> > include/linux/swap.h | 3 -
> > include/linux/swap_slots.h | 28 ----
> > mm/Makefile | 2 +-
> > mm/swap_slots.c | 295 -------------------------------------
> > mm/swap_state.c | 8 +-
> > mm/swapfile.c | 194 ++++++++----------------
> > 6 files changed, 67 insertions(+), 463 deletions(-)
> > delete mode 100644 include/linux/swap_slots.h
> > delete mode 100644 mm/swap_slots.c
> ...
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> ...
> > +swp_entry_t folio_alloc_swap(struct folio *folio)
> > {
> > - int order = swap_entry_order(entry_order);
> > - unsigned long size = 1 << order;
> > + unsigned int order = folio_order(folio);
> > + unsigned int size = 1 << order;
> > struct swap_info_struct *si, *next;
> > - int n_ret = 0;
> > + swp_entry_t entry = {};
> > + unsigned long offset;
> > int node;
> >
> > + if (order) {
> > + /*
> > + * Should not even be attempting large allocations when huge
> > + * page swap is disabled. Warn and fail the allocation.
> > + */
> > + if (!IS_ENABLED(CONFIG_THP_SWAP) || size > SWAPFILE_CLUSTER) {
> > + VM_WARN_ON_ONCE(1);
> > + return entry;
> > + }
> > + }
>
> This warning triggers on s390. CONFIG_THP_SWAP is disabled and order
> is 8 when this triggers (reproduced with ltp's swapon01 test case):
Hi Heiko,
Thanks for the report.
>
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 895 at mm/swapfile.c:1227 folio_alloc_swap+0x438/0x440
> Modules linked in:
> CPU: 1 UID: 0 PID: 895 Comm: swapon01 Not tainted 6.14.0-rc6-00227-g0ff67f990bd4-dirty #25
> Hardware name: IBM 3931 A01 704 (z/VM 7.4.0)
> Krnl PSW : 0704d00180000000 000003ffe051210c (folio_alloc_swap+0x43c/0x440)
> R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000080000000 0000000000000001 0000000000000013 0000000000070000
> 0000000000000006 fffffef40e9da000 0000000000000000 0000037202fc4000
> 0000037f00000100 0000000000000100 0000037fe2e4b770 0000037202fc4000
> 0000000000000000 0000000000000000 000003ffe0512108 0000037fe2e4b3c8
> Krnl Code: 000003ffe05120fe: b9160044 llgfr %r4,%r4
> 000003ffe0512102: c0e5ffdf8c0b brasl %r14,000003ffe0103918
> #000003ffe0512108: af000000 mc 0,0
> >000003ffe051210c: a7f4fe94 brc 15,000003ffe0511e34
> 000003ffe0512110: c0040069ce74 brcl 0,000003ffe124bdf8
> 000003ffe0512116: eb8ff0580024 stmg %r8,%r15,88(%r15)
> 000003ffe051211c: b90400ef lgr %r14,%r15
> 000003ffe0512120: e3f0ffb8ff71 lay %r15,-72(%r15)
> Call Trace:
> [<000003ffe051210c>] folio_alloc_swap+0x43c/0x440
> [<000003ffe050afa6>] add_to_swap+0x56/0xf0
> [<000003ffe045fdc0>] shrink_folio_list+0xe80/0x13b0
> [<000003ffe0461946>] shrink_inactive_list+0x1a6/0x550
> [<000003ffe04624a2>] shrink_lruvec+0x2b2/0x410
> [<000003ffe0462840>] shrink_node_memcgs+0x240/0x2d0
> [<000003ffe0462986>] shrink_node+0xb6/0x3e0
> [<000003ffe046302a>] do_try_to_free_pages+0xda/0x610
> [<000003ffe0464d2c>] try_to_free_mem_cgroup_pages+0x14c/0x2a0
> [<000003ffe0568270>] try_charge_memcg+0x220/0x5d0
> [<000003ffe056867a>] charge_memcg+0x5a/0x270
> [<000003ffe056a484>] __mem_cgroup_charge+0x44/0x80
> [<000003ffe04acf20>] alloc_anon_folio+0x280/0x610
> [<000003ffe04ad45a>] do_anonymous_page+0x1aa/0x5e0
> [<000003ffe04af4c4>] __handle_mm_fault+0x244/0x500
> [<000003ffe04af820>] handle_mm_fault+0xa0/0x170
> [<000003ffe01533f8>] do_exception+0x1d8/0x4a0
> [<000003ffe11fb92a>] __do_pgm_check+0x13a/0x220
> [<000003ffe120c3ce>] pgm_check_handler+0x11e/0x170
> ---[ end trace 0000000000000000 ]---
>
The !CONFIG_THP_SWAP check existed before because slot cache should
reject high order allocation. But slot cache is gone, so large
allocation will directly go to the allocator.
It was not a meaningful WARN in the first place, and now the allocator
should just fail silently for high order allocation, that's totally
fine and expected and will just inform the caller to split the folio.
I'll just change the WARN_ON condition to `if (order && size >
SWAPFILE_CLUSTER)` then, this should silence the WARN.
Powered by blists - more mailing lists