Message-ID: <Z0dhc-DtVsvufv-E@arm.com>
Date: Wed, 27 Nov 2024 18:14:11 +0000
From: Catalin Marinas <catalin.marinas@....com>
To: Yang Shi <yang@...amperecomputing.com>
Cc: Sasha Levin <sashal@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Will Deacon <will@...nel.org>, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, David Hildenbrand <david@...hat.com>
Subject: Re: [GIT PULL] arm64 updates for 6.13-rc1

On Tue, Nov 26, 2024 at 09:41:39AM -0800, Yang Shi wrote:
> On 11/25/24 11:09 AM, Catalin Marinas wrote:
> > On Mon, Nov 25, 2024 at 10:09:59AM -0500, Sasha Levin wrote:
> > > On Mon, Nov 18, 2024 at 10:06:23AM +0000, Catalin Marinas wrote:
> > > > - MTE: hugetlbfs support and the corresponding kselftests
> > >
> > > It looks like with the new feature above, LTP manages to trigger the
> > > following warning on linus-next:
> > >
> > > [ 100.133691] hugefork01 (362): drop_caches: 3
> > > tst_hugepage.c:84: TINFO: 2 hugepage(s) reserved
> > > tst_tmpdir.c:316: TINFO: Using /scratch/ltp-CckaqgMrC1/LTP_hug5PSMw8 as tmpdir (ext2/ext3/ext4 filesystem)
> > > tst_test.c:1085: TINFO: Mounting none to /scratch/ltp-CckaqgMrC1/LTP_hug5PSMw8/hugetlbfs fstyp=hugetlbfs flags=0
> > > tst_test.c:1860: TINFO: LTP version: 20240930
> > > tst_test.c:1864: TINFO: Tested kernel: 6.12.0 #1 SMP PREEMPT @1732504538 aarch64
> > > tst_test.c:1703: TINFO: Timeout per run is 0h 02m 30s
> > > <4>[ 100.355230] ------------[ cut here ]------------
> > > <4>[ 100.356888] WARNING: CPU: 0 PID: 363 at arch/arm64/include/asm/mte.h:58 copy_highpage+0x1d4/0x2d8
> > > <4>[ 100.359160] Modules linked in: crct10dif_ce sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 fuse drm backlight ip_tables x_tables
> > > <4>[ 100.363578] CPU: 0 UID: 0 PID: 363 Comm: hugefork01 Not tainted 6.12.0 #1
> > > <4>[ 100.365113] Hardware name: linux,dummy-virt (DT)
> > > <4>[ 100.365966] pstate: 63402009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> > > <4>[ 100.366468] pc : copy_highpage+0x1d4/0x2d8
> > > <4>[ 100.366780] lr : copy_highpage+0x78/0x2d8
> > > <4>[ 100.367090] sp : ffff80008066bb30
> > > <4>[ 100.368094] x29: ffff80008066bb30 x28: ffffc1ffc3118000 x27: 0000000000000000
> > > <4>[ 100.369341] x26: 0000000000000000 x25: 0000ffff9ce00000 x24: ffffc1ffc3118000
> > > <4>[ 100.370223] x23: fff00000c47ff000 x22: fff00000c4fff000 x21: ffffc1ffc3138000
> > > <4>[ 100.370739] x20: ffffc1ffc3138000 x19: ffffc1ffc311ffc0 x18: ffffffffffffffff
> > > <4>[ 100.371285] x17: 0000000000000000 x16: ffffa302fd05bcb0 x15: 0000ffff9d2fdfff
> > > <4>[ 100.372778] x14: 0000000000000000 x13: 1ffe00001859f161 x12: fff00000c2cf8b0c
> > > <4>[ 100.374124] x11: ffff80008066bd70 x10: ffffa302fe2a20d0 x9 : ffffa302fb438578
> > > <4>[ 100.374877] x8 : ffff80008066ba48 x7 : 0000000000000000 x6 : ffffa302fdbdf000
> > > <4>[ 100.376152] x5 : 0000000000000000 x4 : fff00000c2f239c0 x3 : fff00000c33e43f0
> > > <4>[ 100.376962] x2 : ffffc1ffc3138000 x1 : 00000000000000f4 x0 : 0000000000000000
> > > <4>[ 100.377964] Call trace:
> > > <4>[ 100.378736] copy_highpage+0x1d4/0x2d8 (P)
> > > <4>[ 100.379422] copy_highpage+0x78/0x2d8 (L)
> > > <4>[ 100.380272] copy_user_highpage+0x20/0x48
> > > <4>[ 100.380805] copy_user_large_folio+0x1bc/0x268
> > > <4>[ 100.381601] hugetlb_wp+0x190/0x860
> > > <4>[ 100.382031] hugetlb_fault+0xa28/0xc10
> > > <4>[ 100.382911] handle_mm_fault+0x2a0/0x2c0
> > > <4>[ 100.383511] do_page_fault+0x12c/0x578
> > > <4>[ 100.384913] do_mem_abort+0x4c/0xa8
> > > <4>[ 100.385397] el0_da+0x44/0xb0
> > > <4>[ 100.385775] el0t_64_sync_handler+0xc4/0x138
> > > <4>[ 100.386243] el0t_64_sync+0x198/0x1a0
> > > <4>[ 100.388759] ---[ end trace 0000000000000000 ]---
> >
> > It looks like this can trigger even if the system does not use MTE. The
> > warning was introduced in commit 25c17c4b55de ("hugetlb: arm64: add mte
> > support") and it's supposed to check whether page_mte_tagged() is called
> > on a large folio inadvertently. But in copy_highpage(), if the source is
> > a huge page and untagged, it takes the else path with the
> > page_mte_tagged() check. I think something like below would do but I
> > haven't tried it yet:
>
> Thanks for investigating this. Yes, it is. The fix looks correct to me.
>
> >
> > diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
> > index 87b3f1a25535..ef303a2262c5 100644
> > --- a/arch/arm64/mm/copypage.c
> > +++ b/arch/arm64/mm/copypage.c
> > @@ -30,9 +30,9 @@ void copy_highpage(struct page *to, struct page *from)
> > if (!system_supports_mte())
> > return;
> > - if (folio_test_hugetlb(src) &&
> > - folio_test_hugetlb_mte_tagged(src)) {
> > - if (!folio_try_hugetlb_mte_tagging(dst))
> > + if (folio_test_hugetlb(src)) {
> > + if (!folio_test_hugetlb_mte_tagged(src) ||
> > + !folio_try_hugetlb_mte_tagging(dst))
> > return;
> > /*
I wonder why we had a 'return' here originally rather than a
WARN_ON_ONCE() as we do further down for the page case. Do you see any
issue with the hunk below? The destination should be a new folio and not
tagged yet:
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
index 87b3f1a25535..cc7dfbea1304 100644
--- a/arch/arm64/mm/copypage.c
+++ b/arch/arm64/mm/copypage.c
@@ -30,11 +30,12 @@ void copy_highpage(struct page *to, struct page *from)
 	if (!system_supports_mte())
 		return;
-	if (folio_test_hugetlb(src) &&
-	    folio_test_hugetlb_mte_tagged(src)) {
-		if (!folio_try_hugetlb_mte_tagging(dst))
+	if (folio_test_hugetlb(src)) {
+		if (!folio_test_hugetlb_mte_tagged(src))
 			return;
+		WARN_ON_ONCE(!folio_try_hugetlb_mte_tagging(dst));
+
 		/*
 		 * Populate tags for all subpages.
 		 *
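
To make the branch change in the hunk above easier to follow, here is a
rough userspace model of which path copy_highpage() takes before and
after. This is only a sketch, not kernel code: struct model_folio,
old_dispatch(), new_dispatch() and would_warn() are hypothetical names,
and the two booleans stand in for folio_test_hugetlb(src) and
folio_test_hugetlb_mte_tagged(src).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two source-folio properties that matter. */
struct model_folio {
	bool hugetlb;		/* models folio_test_hugetlb(src) */
	bool mte_tagged;	/* models folio_test_hugetlb_mte_tagged(src) */
};

enum model_path {
	PATH_HUGETLB_TAGS,	/* hugetlb branch: populate tags for all subpages */
	PATH_EARLY_RETURN,	/* untagged hugetlb source: nothing to copy */
	PATH_PAGE_CHECK,	/* else branch: page_mte_tagged(from) is evaluated */
};

/* Old condition: an untagged hugetlb source falls through to the page case. */
static enum model_path old_dispatch(const struct model_folio *src)
{
	if (src->hugetlb && src->mte_tagged)
		return PATH_HUGETLB_TAGS;
	return PATH_PAGE_CHECK;
}

/* Proposed condition: every hugetlb source stays in the hugetlb branch. */
static enum model_path new_dispatch(const struct model_folio *src)
{
	if (src->hugetlb) {
		if (!src->mte_tagged)
			return PATH_EARLY_RETURN;
		return PATH_HUGETLB_TAGS;
	}
	return PATH_PAGE_CHECK;
}

/* Models the mte.h warning: the per-page check reached for a hugetlb folio. */
static bool would_warn(const struct model_folio *src, enum model_path path)
{
	return src->hugetlb && path == PATH_PAGE_CHECK;
}
```

With an untagged hugetlb source, old_dispatch() lands on the per-page
check (the case in the report), while new_dispatch() returns early. The
model leaves out the destination side, i.e. the
folio_try_hugetlb_mte_tagging(dst) result.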
--
Catalin