linux-kernel - [PATCH] KVM: arm64: nv: Allow shadow stage 2 read fault

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250822031853.2007437-1-r09922117@csie.ntu.edu.tw>
Date: Fri, 22 Aug 2025 11:18:53 +0800
From: Wei-Lin Chang <r09922117@...e.ntu.edu.tw>
To: linux-arm-kernel@...ts.infradead.org,
	kvmarm@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Cc: Marc Zyngier <maz@...nel.org>,
	Oliver Upton <oliver.upton@...ux.dev>,
	Joey Gouly <joey.gouly@....com>,
	Suzuki K Poulose <suzuki.poulose@....com>,
	Zenghui Yu <yuzenghui@...wei.com>,
	Catalin Marinas <catalin.marinas@....com>,
	Wei-Lin Chang <r09922117@...e.ntu.edu.tw>,
	Will Deacon <will@...nel.org>
Subject: [PATCH] KVM: arm64: nv: Allow shadow stage 2 read fault

We don't expect to ever encounter a stage-2 read permission fault,
because read permission is always granted when mapping stage-2. However,
this isn't certainly the case when NV is involved. The key is the shadow
mapping built for L2 must have the same or stricter permissions than
those that L1 built for L2, hence opening the possibility of mappings
without read permission.

Let's continue handling the read fault if we're handling a shadow
stage-2 fault, instead of erroring out.

The following illustrates an example, vertical lines stand for either
L0, L1, or L2 running, with left: L0, middle: L1, and right: L2.
Horizontal arrows stand for traps and erets between them.

'''
L0                               L1                               L2
                                               L2                 |
                                               writes to an L2 PA |
|<----------------------------------------------------------------|
|
| L0
| finds no mapping in
| L1's s2pt for L2
|
| L0
| injects data abort
|------------------------------->|
                                 | L1
                                 | for whatever reason
                                 | treats this abort specially,
                                 | only gives it W permission
                                 |
       eret traps to L0          |
|<-------------------------------|
|
|      eret back to L2
|---------------------------------------------------------------->|
                                                                  |
                                                   L2             |
                                                   writes to same |
                                                   L2 PA (replay) |
|<----------------------------------------------------------------|
|
| L0
| finds mapping in s2pt that
| L1 built for L2, create
| shadow mapping for L2,
| but only gives it W to match
| what L1 had given
|
|
|      eret back to L2
|---------------------------------------------------------------->|
                                                                  |
                                                   L2             |
                                                   writes to same |
                                                   L2 PA (replay) |
                                                   success        |
|<----------------------------------------------------------------|
|
|
|------------------------------->| L1
                                 | for some reason, relaxes
                                 | L2's permission from W
                                 | to RW, AND, doesn't flush
                                 | TLB
                                 |
       eret traps to L0          |
|<-------------------------------|
|
|      eret back to L2
|---------------------------------------------------------------->|
                                                                  |
                                                    L2            |
                                                    read to L2 PA |
|<----------------------------------------------------------------|
| stage-2 read fault
|
'''

Signed-off-by: Wei-Lin Chang <r09922117@...e.ntu.edu.tw>
---

I am able to trigger this error with a modified L1 KVM, but I do realize
this requires L1 to be very strange (or even just wrong) so I understand
if we don't want to handle this kind of edge case. On the other hand,
could there also be other ways to trigger this that I have not thought
of?

Another thing is that this change lets L1 get away with not flushing the
TLB, but TLBs are ephemeral so it's fine in this aspect, however I'm not
sure if there are other considerations.

---
 arch/arm64/kvm/mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 1c78864767c5c..41017ca579b19 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1508,8 +1508,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
 	VM_BUG_ON(write_fault && exec_fault);
 
-	if (fault_is_perm && !write_fault && !exec_fault) {
-		kvm_err("Unexpected L2 read permission error\n");
+	if (fault_is_perm && !write_fault && !exec_fault && !nested) {
+		kvm_err("Unexpected S2 read permission error\n");
 		return -EFAULT;
 	}
 
-- 
2.50.1