linux-kernel - Re: [PATCH] mm: swap: Avoid infinite loop if no valid swap entry found during do_swap

Message-ID: <CAGsJ_4zLw5+A+0gaeubBSLuL1EcaHgFa41dt+BG3VgmPsF=Ocw@mail.gmail.com>
Date: Mon, 24 Feb 2025 20:11:47 +1300
From: Barry Song <21cnbao@...il.com>
To: mawupeng <mawupeng1@...wei.com>
Cc: willy@...radead.org, akpm@...ux-foundation.org, david@...hat.com, 
	kasong@...cent.com, ryan.roberts@....com, chrisl@...nel.org, 
	huang.ying.caritas@...il.com, schatzberg.dan@...il.com, hanchuanhua@...o.com, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: swap: Avoid infinite loop if no valid swap entry
 found during do_swap_page

On Mon, Feb 24, 2025 at 2:27 PM mawupeng <mawupeng1@...wei.com> wrote:
>
>
>
> On 2025/2/23 14:18, Barry Song wrote:
> > On Sun, Feb 23, 2025 at 3:42 PM Matthew Wilcox <willy@...radead.org> wrote:
> >>
> >> On Sat, Feb 22, 2025 at 11:59:53AM +0800, mawupeng wrote:
> >>>
> >>>
> >>> On 2025/2/22 11:45, Matthew Wilcox wrote:
> >>>> On Sat, Feb 22, 2025 at 10:46:17AM +0800, Wupeng Ma wrote:
> >>>>> Digging into the source, we found that the swap entry is invalid for an
> >>>>> unknown reason, and this leads to an invalid swap_info_struct. Excessive log
> >>>>> printing can fill up the prioritized log space, leading to the purging of
> >>>>> originally valid logs and hindering problem troubleshooting. To make this
> >>>>> more robust, kill this task.
> >>>>
> >>>> this seems like a very bad way to fix this problem
> >>>
> >>> Sure, it's a bad way to fix this. But isn't it a proper way to make it more
> >>> robust, since otherwise it will produce lots of identical, useless log lines?
> >>
> >> We have a mechanism to prevent flooding the log: <linux/ratelimit.h>.
> >> If you grep for 'ratelimit' in include, you'll see a number of
> >> convenience functions exist; not sure whether you'll need to use the raw
> >> ratelimit stuff, or if you can just use one of the prepared ones.
> >>
> >
> > IMHO, I really don’t think log flooding is the issue here; rather, we’re dealing
> > with an endless page fault. For servers, that might mean the server becomes
> > unresponsive; for phones, it could mean quickly running out of battery.
>
> Yes, log flooding is not the main issue here; the endless #PF is the more serious
> problem.
>

Please send a V2 and update your changelog to accurately describe the real
issue. Additionally, clarify how frequently this occurs and why resolving
the root cause is challenging. Gaoxu reported a similar case on the Android
kernel 6.6, while you're reporting it on 5.10. He observed an occurrence
rate of 1 in 500,000 over a week on customer devices but was unable to
reproduce it in the lab.

BTW, your patch is incorrect, because there is a normal case in which
_swap_info_get() returns NULL:
thread 1                                           thread 2

1. page fault happens with
   an entry that points to
   a swapfile;
                                                   swapoff()
2. do_swap_page()

In this scenario, _swap_info_get() may return NULL, which is expected,
and we should not return -ERRNO; the subsequent page fault will
detect that the PTE has changed. Since you have never enabled any
swap, the appropriate action is to do the following:

        /* Prevent swapoff from happening to us. */
        si = get_swap_device(entry);
-       if (unlikely(!si))
+       if (unlikely(!si)) {
+               /*
+                * Return VM_FAULT_SIGBUS if the swap entry points to
+                * a never-enabled swap file, caused by either hardware
+                * issues or a kernel bug. Return an error code to prevent
+                * an infinite page fault (#PF) loop.
+                */
+               if (WARN_ON_ONCE(!swp_swap_info(entry)))
+                       ret = VM_FAULT_SIGBUS;
                goto out;
+       }
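
To make the distinction concrete, here is a minimal user-space sketch of the
decision in the diff above. It is not kernel code: every name prefixed with
model_ or MODEL_ is hypothetical, and it assumes a table slot stays populated
after swapoff and is NULL only if that swap type was never enabled.

```c
#include <stddef.h>

#define MODEL_MAX_SWAPFILES 4   /* hypothetical table size for this sketch */

enum model_fault {
    MODEL_FAULT_OK,      /* device is live; the swap-in may proceed */
    MODEL_FAULT_RETRY,   /* device was swapped off; return and let the fault retry */
    MODEL_FAULT_SIGBUS   /* swap type was never enabled; stop the #PF loop */
};

/* Hypothetical stand-in for swap_info_struct: only tracks whether it is in use. */
struct model_swap_info {
    int active;          /* nonzero between swapon and swapoff */
};

/* Assumption: a slot stays NULL until its swap type is enabled for the first time. */
struct model_swap_info *model_swap_table[MODEL_MAX_SWAPFILES];

/* Plays the role of swp_swap_info(): raw table lookup, no liveness check. */
static struct model_swap_info *model_lookup(unsigned int type)
{
    return type < MODEL_MAX_SWAPFILES ? model_swap_table[type] : NULL;
}

/* Plays the role of get_swap_device(): NULL unless the device is live. */
static struct model_swap_info *model_get_device(unsigned int type)
{
    struct model_swap_info *si = model_lookup(type);

    return (si && si->active) ? si : NULL;
}

/* Classify a swap fault on the given swap type. */
enum model_fault model_swap_fault(unsigned int type)
{
    if (model_get_device(type))
        return MODEL_FAULT_OK;
    if (!model_lookup(type))
        return MODEL_FAULT_SIGBUS;  /* never-enabled type: corrupted PTE */
    return MODEL_FAULT_RETRY;       /* raced with swapoff: expected, no error */
}
```

Only the never-enabled case turns into an error; the swapoff race still
returns without one, so the retried fault can see the updated PTE.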


> >
> > It’s certainly better to identify the root cause, but it could be due to a
> > bit-flip in DDR or memory corruption in the page table. Until we can properly
> > fix it, the
> > patch seems somewhat reasonable—the wrong application gets killed, it at
> > least has a chance to be restarted by systemd, Android init, etc. A PTE pointing
> > to a non-existent swap file and never being enabled clearly indicates something
> > has gone seriously wrong - either a hardware issue or a kernel bug.
> > At the very least, it warrants a WARN_ON_ONCE(), even after we identify and fix
> > the root cause, as it still enhances the system's robustness.
> >
> > Gaoxu will certainly encounter the same problem if do_swap_page() executes
> > earlier than swap_duplicate() where the PTE points to a non-existent swap
> > file [1]. That means the phone will heat up quickly.
> >
> > [1] https://lore.kernel.org/linux-mm/e223b0e6ba2f4924984b1917cc717bd5@honor.com/
> >

Thanks
Barry