Message-ID: <20251111231835.1232ad8f@kf-m2g5>
Date: Tue, 11 Nov 2025 23:18:35 -0600
From: Aaron Rainbolt <arraybolt3@...il.com>
To: linux-mm@...ck.org, cryptsetup@...ts.linux.dev
Cc: linux-kernel@...r.kernel.org, adrelanos@...nix.org
Subject: Hard system lock-ups when using encrypted swap and RAM is exhausted

Not sure if this is a memory management issue, a LUKS issue, or both,
so I wrote to both mailing lists.

I'm seeing an issue with both the latest mainline kernel (6.18-rc5) and
Debian 13's 6.12 kernel package. When physical memory fills up, the
entire system locks up hard, as if it hit rather severe thrashing,
despite the fact that there appears to be disk cache that can still be
evicted, and there is ample swap space remaining (gigabytes
of it). This issue did not occur with the 6.1 kernel in Debian 12. I'm
seeing this occur in very low-memory Debian VMs, with between 512 and
900 MB RAM, running under VirtualBox and KVM. (I suspect, but have not
verified, that I'm seeing similar behavior under Xen as well.) These
VMs generally use a swappiness of 1, though I have seen a lockup occur
even with a swappiness of 60. The filesystem in use, in case it
matters, is ext4.
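
For anyone comparing setups, the environment details above (kernel
release, swappiness, active swap devices) can be collected in one go. The
following is only a sketch: the helper names `parse_swaps` and `report`
are made up for illustration, and it assumes the standard procfs files
`/proc/swaps`, `/proc/sys/vm/swappiness`, and `/proc/sys/kernel/osrelease`
are readable on the affected VM.

```python
from pathlib import Path


def parse_swaps(text):
    """Parse the contents of /proc/swaps into a list of dicts (sizes in KiB)."""
    entries = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        entries.append({
            "name": fields[0],
            "type": fields[1],
            "size_kib": int(fields[2]),
            "used_kib": int(fields[3]),
            "priority": int(fields[4]),
        })
    return entries


def report():
    """Print kernel release, swappiness, and active swap areas (hypothetical helper)."""
    release = Path("/proc/sys/kernel/osrelease").read_text().strip()
    swappiness = Path("/proc/sys/vm/swappiness").read_text().strip()
    print(f"kernel: {release}")
    print(f"vm.swappiness: {swappiness}")
    for entry in parse_swaps(Path("/proc/swaps").read_text()):
        print(f"swap: {entry['name']} ({entry['type']}), "
              f"{entry['used_kib']} / {entry['size_kib']} KiB used")
```

Calling `report()` before starting a test run records which swap area is
actually active. A dm-crypt swap would be expected to show up as a
device-mapper node rather than as the backing file, though that is an
assumption about the setup and not something verified here.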

To reproduce on a system running Linux 6.18-rc5:

* Follow the steps from
  https://gitlab.com/cryptsetup/cryptsetup/-/wikis/FrequentlyAskedQuestions,
  section "2.3 How do I set up encrypted swap?", but creating a
  swapfile rather than a swap partition. I created an 8 GB swapfile
  with fallocate. Reboot the system when done.
* In a TTY, open a terminal multiplexer (or something you can abuse as
  one, Vim works well), and open two terminals. In one terminal, run
  `htop` so you can observe memory and swap usage.
* In the `htop` terminal, sort by M_RESIDENT.
* In the other terminal, create a new file `test.py` that will
  gradually fill memory at a relatively fast pace and print an
  indicator that it's still alive. I used the following code for this:

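    import time

    count = 0
    mem_list = []  # keeps every allocation referenced so nothing is freed
    while True:
        # Allocate a fresh 2048-element list on every iteration.
        mem_list.append([x for x in range(2048)])
        count += 1
        time.sleep(0.002)  # brief pause so memory fills gradually
        print(count)  # liveness indicator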

* Run the script with `python3 test.py`.
* While the script runs, observe the growing memory usage in `htop`.
  Swap usage should start at or near 0, and RAM usage will gradually
  increase. Once RAM usage starts getting high, some data will start
  being swapped out as expected, but after a short while the whole VM
  will lock up despite there being gigabytes of swap left. (On my KVM
  VM, the last time htop updated its screen, it showed RAM usage of
  712M/846M, and swap usage of 328M/7.40G. The python3 process
  running the script was consuming 551M memory. The VM is entirely
  unresponsive. Incidentally, the python3 process also was in
  uninterruptible sleep when htop last updated its screen, but that
  could mean nothing since it might have come out of sleep between the
  last screen update and the VM lockup.)
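
Because the last `htop` refresh may lag the actual lockup, a small logger
that samples `/proc/meminfo` at a short interval and syncs each sample to
disk could give a closer picture of the final state. This is a rough
sketch, not something used in the report above: the function names, the
log path `meminfo.log`, and the 0.5 second interval are all arbitrary
choices for illustration, and the `fsync` call may itself stall once the
system starts to hang.

```python
import os
import time


def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into {field name: integer value}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])  # most fields are in kB
    return info


def format_sample(info):
    """Render the fields relevant to this report as whole MiB."""
    fields = ("MemTotal", "MemAvailable", "Cached", "SwapTotal", "SwapFree")
    return " ".join(f"{name}={info.get(name, 0) // 1024}MiB" for name in fields)


def log_forever(path="meminfo.log", interval=0.5):
    """Append one timestamped sample per interval (hypothetical helper)."""
    with open(path, "a") as log:
        while True:
            with open("/proc/meminfo") as f:
                info = parse_meminfo(f.read())
            log.write(f"{time.time():.1f} {format_sample(info)}\n")
            log.flush()
            os.fsync(log.fileno())  # push the sample to disk before sleeping
            time.sleep(interval)
```

Running `log_forever()` in a third terminal before starting `test.py`
leaves the last recorded sample in `meminfo.log` after a forced reset,
provided the write reached the disk before the hang.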

Under Bookworm with Linux 6.1, the Python script would occasionally
freeze, but the VM would remain responsive, and the script would
eventually resume. Even with kernel 6.12, unencrypted swapfiles and
swapfiles that are technically unencrypted but live on a LUKS volume
both behave as expected. It's only swapfiles that are themselves
encrypted that seem to trigger these lockups.

I haven't looked at the code at all, but it seems like maybe the memory
that LUKS needs in order to operate is being consumed, thus making it
impossible to swap anything in or out of the swapfile? That seems like
it would cause these symptoms or similar, though I don't know.

Let me know if I can provide any further information on the issue. I'm
happy to bisect the kernel if it will help.

--
Aaron
