Message-ID: <0c09425b-c8ba-4ed6-b429-0bce4e7d00e9@os.amperecomputing.com>
Date: Tue, 26 Nov 2024 09:41:39 -0800
From: Yang Shi <yang@...amperecomputing.com>
To: Catalin Marinas <catalin.marinas@....com>, Sasha Levin <sashal@...nel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Will Deacon <will@...nel.org>, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, David Hildenbrand <david@...hat.com>
Subject: Re: [GIT PULL] arm64 updates for 6.13-rc1
On 11/25/24 11:09 AM, Catalin Marinas wrote:
> Thanks Sasha.
>
> Adding Yang Shi (he contributed the support) and David H.
>
> On Mon, Nov 25, 2024 at 10:09:59AM -0500, Sasha Levin wrote:
>> On Mon, Nov 18, 2024 at 10:06:23AM +0000, Catalin Marinas wrote:
>>> - MTE: hugetlbfs support and the corresponding kselftests
>> Hi Catalin,
>>
>> It looks like with the new feature above, LTP manages to trigger the
>> following warning on linus-next:
>>
>> [ 100.133691] hugefork01 (362): drop_caches: 3
>> tst_hugepage.c:84: TINFO: 2 hugepage(s) reserved
>> tst_tmpdir.c:316: TINFO: Using /scratch/ltp-CckaqgMrC1/LTP_hug5PSMw8 as tmpdir (ext2/ext3/ext4 filesystem)
>> tst_test.c:1085: TINFO: Mounting none to /scratch/ltp-CckaqgMrC1/LTP_hug5PSMw8/hugetlbfs fstyp=hugetlbfs flags=0
>> tst_test.c:1860: TINFO: LTP version: 20240930
>> tst_test.c:1864: TINFO: Tested kernel: 6.12.0 #1 SMP PREEMPT @1732504538 aarch64
>> tst_test.c:1703: TINFO: Timeout per run is 0h 02m 30s
>> <4>[ 100.355230] ------------[ cut here ]------------
>> <4>[ 100.356888] WARNING: CPU: 0 PID: 363 at arch/arm64/include/asm/mte.h:58 copy_highpage+0x1d4/0x2d8
>> <4>[ 100.359160] Modules linked in: crct10dif_ce sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 fuse drm backlight ip_tables x_tables
>> <4>[ 100.363578] CPU: 0 UID: 0 PID: 363 Comm: hugefork01 Not tainted 6.12.0 #1
>> <4>[ 100.365113] Hardware name: linux,dummy-virt (DT)
>> <4>[ 100.365966] pstate: 63402009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
>> <4>[ 100.366468] pc : copy_highpage+0x1d4/0x2d8
>> <4>[ 100.366780] lr : copy_highpage+0x78/0x2d8
>> <4>[ 100.367090] sp : ffff80008066bb30
>> <4>[ 100.368094] x29: ffff80008066bb30 x28: ffffc1ffc3118000 x27: 0000000000000000
>> <4>[ 100.369341] x26: 0000000000000000 x25: 0000ffff9ce00000 x24: ffffc1ffc3118000
>> <4>[ 100.370223] x23: fff00000c47ff000 x22: fff00000c4fff000 x21: ffffc1ffc3138000
>> <4>[ 100.370739] x20: ffffc1ffc3138000 x19: ffffc1ffc311ffc0 x18: ffffffffffffffff
>> <4>[ 100.371285] x17: 0000000000000000 x16: ffffa302fd05bcb0 x15: 0000ffff9d2fdfff
>> <4>[ 100.372778] x14: 0000000000000000 x13: 1ffe00001859f161 x12: fff00000c2cf8b0c
>> <4>[ 100.374124] x11: ffff80008066bd70 x10: ffffa302fe2a20d0 x9 : ffffa302fb438578
>> <4>[ 100.374877] x8 : ffff80008066ba48 x7 : 0000000000000000 x6 : ffffa302fdbdf000
>> <4>[ 100.376152] x5 : 0000000000000000 x4 : fff00000c2f239c0 x3 : fff00000c33e43f0
>> <4>[ 100.376962] x2 : ffffc1ffc3138000 x1 : 00000000000000f4 x0 : 0000000000000000
>> <4>[ 100.377964] Call trace:
>> <4>[ 100.378736] copy_highpage+0x1d4/0x2d8 (P)
>> <4>[ 100.379422] copy_highpage+0x78/0x2d8 (L)
>> <4>[ 100.380272] copy_user_highpage+0x20/0x48
>> <4>[ 100.380805] copy_user_large_folio+0x1bc/0x268
>> <4>[ 100.381601] hugetlb_wp+0x190/0x860
>> <4>[ 100.382031] hugetlb_fault+0xa28/0xc10
>> <4>[ 100.382911] handle_mm_fault+0x2a0/0x2c0
>> <4>[ 100.383511] do_page_fault+0x12c/0x578
>> <4>[ 100.384913] do_mem_abort+0x4c/0xa8
>> <4>[ 100.385397] el0_da+0x44/0xb0
>> <4>[ 100.385775] el0t_64_sync_handler+0xc4/0x138
>> <4>[ 100.386243] el0t_64_sync+0x198/0x1a0
>> <4>[ 100.388759] ---[ end trace 0000000000000000 ]---
> It looks like this can trigger even if the system does not use MTE. The
> warning was introduced in commit 25c17c4b55de ("hugetlb: arm64: add mte
> support") and it's supposed to check whether page_mte_tagged() is called
> on a large folio inadvertently. But in copy_highpage(), if the source is
> a huge page and untagged, it takes the else path with the
> page_mte_tagged() check. I think something like below would do but I
> haven't tried it yet:

Hi Catalin,

Thanks for investigating this. Yes, that is what happens. The fix looks
correct to me.

>
> diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
> index 87b3f1a25535..ef303a2262c5 100644
> --- a/arch/arm64/mm/copypage.c
> +++ b/arch/arm64/mm/copypage.c
> @@ -30,9 +30,9 @@ void copy_highpage(struct page *to, struct page *from)
>  	if (!system_supports_mte())
>  		return;
>
> -	if (folio_test_hugetlb(src) &&
> -	    folio_test_hugetlb_mte_tagged(src)) {
> -		if (!folio_try_hugetlb_mte_tagging(dst))
> +	if (folio_test_hugetlb(src)) {
> +		if (!folio_test_hugetlb_mte_tagged(src) ||
> +		    !folio_try_hugetlb_mte_tagging(dst))
>  			return;
>
>  		/*
>
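
To make the control-flow change in the quoted diff easier to follow, here
is a minimal user-space sketch of the branch selection only. It is not
kernel code: struct model_folio, enum model_path, path_before(),
path_after() and would_warn() are hypothetical names invented for this
illustration, and the dst_try_ok flag merely stands in for the result of
folio_try_hugetlb_mte_tagging(dst). It assumes the warning fires whenever
the per-page check is reached for a hugetlb folio, as described above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the source folio state the kernel tests. */
struct model_folio {
	bool hugetlb;    /* models folio_test_hugetlb(src) */
	bool mte_tagged; /* models folio_test_hugetlb_mte_tagged(src) */
};

/* Which part of copy_highpage() the MTE handling ends up in. */
enum model_path {
	PATH_EARLY_RETURN,   /* return without copying tags */
	PATH_HUGETLB_TAGS,   /* hugetlb branch that copies tags */
	PATH_PER_PAGE_CHECK, /* else branch that calls page_mte_tagged() */
};

/* Branch selection before the fix: both tests share one condition. */
static enum model_path path_before(const struct model_folio *src,
				   bool dst_try_ok)
{
	if (src->hugetlb && src->mte_tagged)
		return dst_try_ok ? PATH_HUGETLB_TAGS : PATH_EARLY_RETURN;

	/* An untagged hugetlb source falls through to here. */
	return PATH_PER_PAGE_CHECK;
}

/* Branch selection after the fix: hugetlb sources never reach the else. */
static enum model_path path_after(const struct model_folio *src,
				  bool dst_try_ok)
{
	if (src->hugetlb) {
		if (!src->mte_tagged || !dst_try_ok)
			return PATH_EARLY_RETURN;
		return PATH_HUGETLB_TAGS;
	}

	return PATH_PER_PAGE_CHECK;
}

/* Assumption: the warning triggers when the per-page check sees hugetlb. */
static bool would_warn(const struct model_folio *src, enum model_path path)
{
	return path == PATH_PER_PAGE_CHECK && src->hugetlb;
}
```

In this model, an untagged hugetlb source reaches the per-page check with
path_before() and would warn, while path_after() returns early for the
same input. Tagged hugetlb sources and non-hugetlb sources take the same
path in both versions.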