Message-ID: <CABXGCsNqk6pOkocJ0ctcHssCvke2kqhzoR2BGf_Hh1hWPZATuA@mail.gmail.com>
Date: Fri, 30 Jan 2026 18:49:00 +0500
From: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
To: Linux Memory Management List <linux-mm@...ck.org>, 
	Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Vlastimil Babka <vbabka@...e.cz>, chrisl@...nel.org, 
	kasong@...cent.com, Hugh Dickins <hughd@...gle.com>
Subject: [RFC PATCH] mm/page_alloc: fix use-after-free in swap due to stale
 page data after split_page()

Hi,

I've been debugging a use-after-free bug in the swap subsystem that manifests
as a crash in free_swap_count_continuations() during swapoff on zram devices.

== Problem ==

KASAN reports a wild-memory-access at address 0xdead000000000100 (LIST_POISON1):

  Oops: general protection fault, probably for non-canonical address 0xfbd59c0000000020
  KASAN: maybe wild-memory-access in range [0xdead000000000100-0xdead000000000107]
  RIP: 0010:__do_sys_swapoff+0x1151/0x1860

  RBP: dead0000000000f8
  R13: dead000000000100

The crash occurs when free_swap_count_continuations() iterates over a
list_head containing LIST_POISON values from a previous list_del().
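
As a cross-check, the faulting address and the KASAN range in the report
are consistent with each other. The sketch below assumes the usual x86_64
values for the poison delta and the generic KASAN shadow mapping (shift 3,
offset 0xdffffc0000000000); none of these constants are stated in the
report itself, so treat them as assumptions about the configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86_64 values; the report does not state them. */
#define POISON_POINTER_DELTA 0xdead000000000000ULL
#define LIST_POISON1         (POISON_POINTER_DELTA + 0x100ULL)
#define KASAN_SHADOW_SHIFT   3
#define KASAN_SHADOW_OFFSET  0xdffffc0000000000ULL

/* Shadow byte address that generic KASAN reads before touching addr. */
static inline uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Checks that the two addresses in the oops describe the same access. */
static inline void check_report_addresses(void)
{
	/* R13 in the register dump. */
	assert(LIST_POISON1 == 0xdead000000000100ULL);
	/* The non-canonical address in the "Oops:" line. */
	assert(kasan_shadow_addr(LIST_POISON1) == 0xfbd59c0000000020ULL);
}
```

Under these assumptions the general protection fault is the KASAN shadow
lookup for LIST_POISON1, not an access to some unrelated address.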

== Root Cause ==

The swap subsystem uses vmalloc_to_page() to get struct page pointers for
the swap_map array, then uses page->private and page->lru for swap count
continuation lists.

When vmalloc allocates high-order pages without __GFP_COMP and splits them
via split_page(), the resulting pages may contain stale data:

1. post_alloc_hook() only clears page->private for the head page (page[0])
2. For tail pages, split_page() only calls set_page_refcounted(); it does
   not reset page->private or page->lru
3. Tail pages retain whatever was in page->private and page->lru from
   previous use - including LIST_POISON values from prior list_del() calls

In add_swap_count_continuation() (mm/swapfile.c):

    if (!page_private(head)) {
        INIT_LIST_HEAD(&head->lru);
        set_page_private(head, SWP_CONTINUED);
    }

If head is a vmalloc tail page with stale non-zero page->private, the
INIT_LIST_HEAD is skipped, leaving page->lru with poison values. When
free_swap_count_continuations() later iterates this list, it crashes.
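
The failure mode can be modelled in userspace. This is a minimal sketch,
not kernel code: struct fake_page, FAKE_SWP_CONTINUED, the FAKE_POISON
values and both helpers are hypothetical stand-ins, and only the quoted
check from add_swap_count_continuation() is mirrored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical reduced stand-in for struct page; not the kernel layout. */
struct fake_page {
	unsigned long private;
	struct list_head lru;
};

/* Illustrative poison markers and flag value, chosen for this model only. */
#define FAKE_POISON1       ((struct list_head *)0x100)
#define FAKE_POISON2       ((struct list_head *)0x122)
#define FAKE_SWP_CONTINUED 1UL

static inline void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Mirrors only the quoted check: lru is initialized iff private is zero. */
static inline void fake_add_continuation(struct fake_page *head)
{
	assert(head != NULL);
	if (!head->private) {
		init_list_head(&head->lru);
		head->private = FAKE_SWP_CONTINUED;
	}
}

/* True if walking head->lru would start from a poison pointer. */
static inline bool fake_walk_would_fault(const struct fake_page *head)
{
	return head->lru.next == FAKE_POISON1 || head->lru.prev == FAKE_POISON2;
}
```

With private == 0 and a poisoned lru, fake_add_continuation() repairs the
list head. With a stale non-zero private, the list head is left poisoned
and fake_walk_would_fault() reports true, which corresponds to the crash
seen in free_swap_count_continuations().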

The comment at line 3862 of mm/swapfile.c says "Page allocation does not
initialize the page's lru field, but it does always reset its private
field" - this assumption is incorrect for vmalloc pages obtained via
split_page().

== Proposed Fix ==

Initialize page->private and page->lru for all pages in split_page().
This matches the documented expectation in mm/vmalloc.c:

  "High-order allocations must be able to be treated as independent
   small pages by callers... Some drivers do their own refcounting
   on vmalloc_to_page() pages, some use page->mapping, page->lru, etc."

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3122,6 +3122,16 @@ void split_page(struct page *page, unsigned int order)
        VM_BUG_ON_PAGE(PageCompound(page), page);
        VM_BUG_ON_PAGE(!page_count(page), page);

+       /*
+        * Split pages may contain stale data from previous use. Initialize
+        * page->private and page->lru which may have LIST_POISON values.
+        */
+       INIT_LIST_HEAD(&page->lru);
+       for (i = 1; i < (1 << order); i++) {
+               set_page_private(page + i, 0);
+               INIT_LIST_HEAD(&page[i].lru);
+       }
+
        for (i = 1; i < (1 << order); i++)
                set_page_refcounted(page + i);
        split_page_owner(page, order, 0);
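
The effect of the hunk above can be illustrated with a userspace model.
This is only a sketch under stated assumptions: struct fake_page and all
fake_* helpers are hypothetical, the model ignores everything split_page()
does apart from refcounting, and it assumes the head page's private field
was already cleared at allocation time.

```c
#include <assert.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical reduced stand-in for struct page; not the kernel layout. */
struct fake_page {
	unsigned long private;
	struct list_head lru;
	int refcount;
};

static inline void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Models the behaviour described above: tail pages only get a refcount. */
static inline void fake_split_page(struct fake_page *page, unsigned int order)
{
	assert(page->refcount > 0); /* stands in for VM_BUG_ON_PAGE(!page_count()) */
	for (unsigned int i = 1; i < (1u << order); i++)
		page[i].refcount = 1;
}

/* Models the proposed hunk: reset lru on every page, private on tails. */
static inline void fake_split_page_patched(struct fake_page *page,
					   unsigned int order)
{
	init_list_head(&page->lru);
	for (unsigned int i = 1; i < (1u << order); i++) {
		page[i].private = 0;
		init_list_head(&page[i].lru);
	}
	fake_split_page(page, order);
}

/* Number of tail pages still carrying a non-zero private field. */
static inline unsigned int fake_count_stale_tails(const struct fake_page *page,
						  unsigned int order)
{
	unsigned int stale = 0;

	for (unsigned int i = 1; i < (1u << order); i++)
		if (page[i].private)
			stale++;
	return stale;
}
```

For an order-2 block whose tail pages start with stale private values,
fake_count_stale_tails() returns 3 after fake_split_page() and 0 after
fake_split_page_patched().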

== Testing ==

Reproduced with a stress test cycling swapon/swapoff on 8GB zram under
memory pressure:
  - Without patch: crash within ~50 iterations
  - With patch: 1154+ iterations, no crash

The bug was originally discovered on Fedora 44 with kernel 6.19.0-rc7
during normal system shutdown after extended use.

== Questions ==

1. Is split_page() the right place for this fix, or should the swap code
   be more defensive about uninitialized vmalloc pages?

2. Should prep_new_page()/post_alloc_hook() initialize all pages in
   high-order allocations, not just the head?

3. Are there other fields besides page->private and page->lru that
   callers of split_page() might expect to be initialized?

Thoughts?

-- 
Best Regards,
Mike Gavrilov.
