Message-ID: <20260102173603.18247-1-linmag7@gmail.com>
Date: Fri, 2 Jan 2026 18:30:42 +0100
From: Magnus Lindholm <linmag7@...il.com>
To: linux-kernel@...r.kernel.org,
linux-alpha@...r.kernel.org,
hch@...radead.org,
macro@...am.me.uk,
glaubitz@...sik.fu-berlin.de,
mattst88@...il.com,
richard.henderson@...aro.org,
ink@...een.parts
Cc: Magnus Lindholm <linmag7@...il.com>
Subject: [PATCH 0/1] alpha: fix user-space corruption during memory compaction

This patch fixes long-standing user-space crashes on Alpha systems
when memory compaction is enabled.

Observed symptoms include:

- sporadic SIGSEGV in unrelated user programs
- glibc allocator failures (e.g. "unaligned tcache chunk detected")
- gcc "internal compiler error"
- heap corruption detected by malloc consistency checks

The failures occur only when page migration / compaction is active
and disappear when compaction is disabled. They affect both UP and
SMP kernels and are not specific to a particular Alpha CPU model.

Root cause
==========

Alpha relies on Address Space Numbers (ASNs) for user-space instruction
cache coherency. Existing TLB shootdown paths during page migration
primarily depend on MM context rollover, with lazy invalidation of
translations on CPUs not actively running the MM.

This approach is insufficient during page migration. Migration creates
a window where stale data or instruction translations can survive long
enough for a CPU to perform loads or stores using the wrong physical
page. This leads to silent user-space memory corruption that later
manifests as crashes.
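
The window described above can be illustrated with a toy user-space
model. This is only a sketch under simplifying assumptions, not kernel
code and not taken from the patch: every name in it (toy_machine,
toy_migrate, toy_scenario, ...) is hypothetical, each CPU caches a
single translation, and "lazy invalidation" is modelled simply as a
remote CPU keeping its cached entry after the page has moved.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS  2
#define NPAGES 4

/* Toy model only: one virtual page, one PTE, one cached entry per CPU. */
struct toy_tlb {
	bool valid;
	int pfn;
};

struct toy_machine {
	int pte_pfn;               /* current page-table mapping */
	struct toy_tlb tlb[NCPUS]; /* per-CPU cached translation */
	int phys[NPAGES];          /* one word of data per physical page */
};

static void toy_init(struct toy_machine *m, int pfn)
{
	int i;

	m->pte_pfn = pfn;
	for (i = 0; i < NCPUS; i++)
		m->tlb[i].valid = false;
	for (i = 0; i < NPAGES; i++)
		m->phys[i] = 0;
}

/* Translate through the CPU's TLB, refilling from the PTE on a miss. */
static int toy_translate(struct toy_machine *m, int cpu)
{
	assert(cpu >= 0 && cpu < NCPUS);
	if (!m->tlb[cpu].valid) {
		m->tlb[cpu].pfn = m->pte_pfn;
		m->tlb[cpu].valid = true;
	}
	return m->tlb[cpu].pfn;
}

static void toy_store(struct toy_machine *m, int cpu, int value)
{
	m->phys[toy_translate(m, cpu)] = value;
}

static int toy_load(struct toy_machine *m, int cpu)
{
	return m->phys[toy_translate(m, cpu)];
}

/*
 * Move the page to new_pfn. With flush_all == false only the migrating
 * CPU drops its cached entry (the simplified "lazy" case); with
 * flush_all == true every CPU drops it before migration completes.
 */
static void toy_migrate(struct toy_machine *m, int local_cpu, int new_pfn,
			bool flush_all)
{
	int cpu;

	assert(new_pfn >= 0 && new_pfn < NPAGES);
	m->phys[new_pfn] = m->phys[m->pte_pfn];
	m->pte_pfn = new_pfn;
	for (cpu = 0; cpu < NCPUS; cpu++)
		if (flush_all || cpu == local_cpu)
			m->tlb[cpu].valid = false;
}

/* Returns what CPU 0 reads after CPU 1 stores 2 following a migration. */
static int toy_scenario(bool flush_all)
{
	struct toy_machine m;

	toy_init(&m, 0);
	toy_store(&m, 0, 1);      /* CPU 0 writes 1 to the page */
	(void)toy_load(&m, 1);    /* CPU 1 caches the old translation */
	toy_migrate(&m, 0, 1, flush_all);
	toy_store(&m, 1, 2);      /* CPU 1 stores after the migration */
	return toy_load(&m, 0);
}
```

In this model toy_scenario(false) returns 1: CPU 1's store went to the
old physical page through its stale entry and is silently lost, while
toy_scenario(true) returns 2 because every CPU re-reads the page table
before touching the page again.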

Testing shows that the corruption is triggered during the unmap phase
of migration. Installing the fix in ptep_clear_flush() is therefore
sufficient; no additional handling is required when installing the
new mapping.

Instruction barriers were evaluated during debugging but were found
not to be required. Immediate TLB invalidation combined with ASN
rollover is sufficient to prevent stale instruction and data access.

Solution
========

This patch introduces a migration-specific TLB flush helper that
combines:
- MM context invalidation (ASN rollover),
- immediate per-CPU TLB invalidation,
- synchronous cross-CPU shootdown.
The helper is used only by the page migration / compaction unmap
path, leaving normal TLB semantics unchanged for other VM operations.
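
As a rough illustration of how the three steps listed above fit
together, the following user-space sketch models them on a toy per-CPU
state array. It is not the code from this patch: toy_cpu,
toy_flush_one() and toy_migration_flush() are hypothetical names, an
ASN value of 0 is assumed to mean "no valid context", and the
cross-CPU shootdown is modelled as a direct call rather than an IPI.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NCPUS 4

/* Toy per-CPU state for one mm and one migrating page. */
struct toy_cpu {
	unsigned long mm_asn; /* ASN held for the mm; 0 = must reallocate */
	bool tlb_valid;       /* cached translation for the page exists */
	bool acked;           /* this CPU has completed the flush */
};

/* Steps 1 and 2 on a single CPU. */
static void toy_flush_one(struct toy_cpu *c)
{
	c->mm_asn = 0;        /* step 1: invalidate the mm context (ASN rollover) */
	c->tlb_valid = false; /* step 2: drop the translation immediately */
	c->acked = true;
}

/*
 * Hypothetical migration flush: run steps 1 and 2 locally, then on
 * every other CPU, and return only once all CPUs have completed
 * (step 3). Returns the number of CPUs that acknowledged.
 */
static int toy_migration_flush(struct toy_cpu cpus[], int ncpus, int self)
{
	int cpu, acks = 0;

	assert(self >= 0 && self < ncpus);
	for (cpu = 0; cpu < ncpus; cpu++)
		cpus[cpu].acked = false;

	toy_flush_one(&cpus[self]);

	/* In a kernel this would be an IPI that the caller waits on. */
	for (cpu = 0; cpu < ncpus; cpu++)
		if (cpu != self)
			toy_flush_one(&cpus[cpu]);

	for (cpu = 0; cpu < ncpus; cpu++)
		acks += cpus[cpu].acked ? 1 : 0;
	return acks;
}

/* True if any CPU could still use the old translation. */
static bool toy_any_stale(const struct toy_cpu cpus[], int ncpus)
{
	int cpu;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (cpus[cpu].tlb_valid)
			return true;
	return false;
}
```

The point of the sketch is the ordering guarantee: the caller does not
proceed until no CPU holds either a usable ASN for the old context or
a cached translation for the page being unmapped.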

Summary
=======

This patch fixes real user-visible corruption bugs during page
migration on Alpha by making TLB shootdowns migration-safe, without
impacting non-migration code paths.

Thanks for taking a look.

Magnus Lindholm (1):
  alpha: fix user-space corruption during memory compaction

 arch/alpha/include/asm/pgtable.h  |  33 ++++++++-
 arch/alpha/include/asm/tlbflush.h |   4 +-
 arch/alpha/mm/Makefile            |   2 +-
 arch/alpha/mm/tlbflush.c          | 112 ++++++++++++++++++++++++++++++
 4 files changed, 148 insertions(+), 3 deletions(-)
 create mode 100644 arch/alpha/mm/tlbflush.c

--
2.51.0