Message-ID: <86edbmu8kn.wl-maz@kernel.org>
Date: Wed, 03 Apr 2024 09:58:00 +0100
From: Marc Zyngier <maz@...nel.org>
To: Gavin Shan <gshan@...hat.com>
Cc: linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org,
catalin.marinas@....com,
will@...nel.org,
akpm@...ux-foundation.org,
apopple@...dia.com,
mark.rutland@....com,
ryan.roberts@....com,
rananta@...gle.com,
yangyicong@...ilicon.com,
v-songbaohua@...o.com,
yezhenyu2@...wei.com,
yihyu@...hat.com,
shan.gavin@...il.com
Subject: Re: [PATCH] arm64: tlb: Fix TLBI RANGE operand
On Wed, 03 Apr 2024 07:49:29 +0100,
Gavin Shan <gshan@...hat.com> wrote:
>
> KVM/arm64 relies on the TLBI RANGE feature to flush TLBs when the dirty
> bitmap is collected by the VMM and the corresponding PTEs need to be
> write-protected again. Unfortunately, the operand passed to the TLBI
> RANGE instruction isn't correctly computed by commit d1d3aa98b1d4
> ("arm64: tlb: Use the TLBI RANGE feature in arm64"). This leads to a
> crash on the destination VM after live migration because some of the
> dirty pages are missed.
>
> For example, I have a VM with 8GB of memory assigned, starting from
> 0x40000000 (1GB). Note that the host has 4KB as the base page size.
> All TLBs for the VM can be covered by one TLBI RANGE operation.
> However, I receive 0xffff708000040000 as the operand, which is wrong;
> the correct one should be 0x00007f8000040000. The wrong operand
> decodes to SCALE=3 (bits[45:44]) and NUM=1 (bits[43:39]), so only 1GB
> instead of 8GB of memory is covered.
>
> Fix the macro __TLBI_RANGE_NUM() so that the correct NUM and TLBI
> RANGE operand are provided.
>
> Fixes: d1d3aa98b1d4 ("arm64: tlb: Use the TLBI RANGE feature in arm64")
> Cc: stable@...nel.org # v5.10+
> Reported-by: Yihuang Yu <yihyu@...hat.com>
> Signed-off-by: Gavin Shan <gshan@...hat.com>
> ---
> arch/arm64/include/asm/tlbflush.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 3b0e8248e1a4..07c4fb4b82b4 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -166,7 +166,7 @@ static inline unsigned long get_trans_granule(void)
> */
> #define TLBI_RANGE_MASK GENMASK_ULL(4, 0)
> #define __TLBI_RANGE_NUM(pages, scale) \
> - ((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1)
> + ((((pages) >> (5 * (scale) + 1)) - 1) & TLBI_RANGE_MASK)
>
> /*
> * TLB Invalidation
This looks pretty wrong, by the very definition of the comment that's
just above:
<quote>
/*
* Generate 'num' values from -1 to 30 with -1 rejected by the
* __flush_tlb_range() loop below.
*/
</quote>
With your change, num can't ever be negative, and that breaks
__flush_tlb_range_op():
<quote>
num = __TLBI_RANGE_NUM(pages, scale); \
if (num >= 0) { \
addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
scale, num, tlb_level); \
__tlbi(r##op, addr); \
if (tlbi_user) \
__tlbi_user(r##op, addr); \
start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
pages -= __TLBI_RANGE_PAGES(num, scale); \
} \
scale--; \
</quote>
We'll then shove whatever value we've found into the TLBI operation,
leading to unknown results instead of properly adjusting the scale to
issue a smaller invalidation.
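To make the failure mode concrete, here's a quick (untested) user-space
sketch comparing the two definitions. NUM_CURRENT, NUM_PROPOSED and the
harness itself are made up for illustration; only TLBI_RANGE_MASK and
the shift are lifted from tlbflush.h:
#include <stdio.h>
#define TLBI_RANGE_MASK		0x1fULL	/* GENMASK_ULL(4, 0) */
/* current kernel definition: mask first, then subtract */
#define NUM_CURRENT(pages, scale) \
	((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1)
/* proposed definition: subtract first, then mask */
#define NUM_PROPOSED(pages, scale) \
	((((pages) >> (5 * (scale) + 1)) - 1) & TLBI_RANGE_MASK)
int main(void)
{
	unsigned long long pages = 2;	/* 2 pages left, scale still 3 */
	int num;
	/* current: num = -1, so the loop retries with scale = 2 */
	num = NUM_CURRENT(pages, 3);
	printf("current:  num = %d\n", num);
	/* proposed: num = 31, i.e. a 32 << 16 page invalidation */
	num = NUM_PROPOSED(pages, 3);
	printf("proposed: num = %d\n", num);
	return 0;
}
With your version, any leftover range smaller than 2^(5 * scale + 1)
pages reads back as num = 31 instead of -1, so the loop never gets the
chance to drop to a smaller scale.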
I think the problem is that you are triggering NUM=31 and SCALE=3,
which the current code cannot handle as per the comment above
__flush_tlb_range_op() (we can't do NUM=30 and SCALE=4, obviously).
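To spell the arithmetic out (assuming __TLBI_RANGE_PAGES(num, scale),
as used in the loop quoted above, expands to
(num + 1) << (5 * scale + 1) pages):
  8GB of 4KB pages       = 0x200000 = 32 << 16 pages
  NUM=30, SCALE=3 covers   31 << 16 pages (7.75GB) -- not enough
  NUM=31, SCALE=3 covers   32 << 16 pages (8GB)    -- the only single-op fit
so this range simply isn't representable with num capped at 30.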
Can you try the untested patch below?
Thanks,
M.
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 3b0e8248e1a4..b71a1cece802 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -379,10 +379,6 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
* 3. If there is 1 page remaining, flush it through non-range operations. Range
* operations can only span an even number of pages. We save this for last to
* ensure 64KB start alignment is maintained for the LPA2 case.
- *
- * Note that certain ranges can be represented by either num = 31 and
- * scale or num = 0 and scale + 1. The loop below favours the latter
- * since num is limited to 30 by the __TLBI_RANGE_NUM() macro.
*/
#define __flush_tlb_range_op(op, start, pages, stride, \
asid, tlb_level, tlbi_user, lpa2) \
@@ -407,6 +403,7 @@ do { \
\
num = __TLBI_RANGE_NUM(pages, scale); \
if (num >= 0) { \
+ num += 1; \
addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
scale, num, tlb_level); \
__tlbi(r##op, addr); \
--
Without deviation from the norm, progress is not possible.