[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <4f1af31b-15be-cb47-6b34-45de1b5696be@loongson.cn>
Date: Mon, 9 Oct 2023 22:32:44 +0800
From: Sui Jingfeng <suijingfeng@...ngson.cn>
To: Icenowy Zheng <uwu@...nowy.me>,
Huacai Chen <chenhuacai@...nel.org>,
WANG Xuerui <kernel@...0n.name>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Weihao Li <liweihao@...ngson.cn>,
"Mike Rapoport (IBM)" <rppt@...nel.org>,
Jun Yi <yijun@...ngson.cn>, Baoquan He <bhe@...hat.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
David Hildenbrand <david@...hat.com>,
Hongchen Zhang <zhanghongchen@...ngson.cn>,
Binbin Zhou <zhoubinbin@...ngson.cn>,
Zhen Lei <thunder.leizhen@...wei.com>,
Tiezhu Yang <yangtiezhu@...ngson.cn>,
Thomas Gleixner <tglx@...utronix.de>,
Zhihong Dong <donmor3000@...mail.com>,
loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] loongarch/mm: disable WUC for pgprot_writecombine as
same as ioremap_wc
Hi,
On 2023/10/9 12:28, Icenowy Zheng wrote:
> Currently the code disables WUC only disables it for ioremap_wc(), which
> is only used when mapping writecombine pages like ioremap() (mapped to
> the kernel space). For VRAM mapped in TTM/GEM, it's mapped with a
> crafted pgprot with pgprot_writecombine() function, which isn't
> corrently disabled now.
>
> Disable WUC for pgprot_writecombine() (fallback to SUC) too.
>
> This improves AMDGPU driver stability on Loongson 3A5000 machines.
>
> Signed-off-by: Icenowy Zheng <uwu@...nowy.me>
> ---
> Changes since v1:
> - Removed _WC macros
> - Mention ioremap_wc in commit message
>
> arch/loongarch/include/asm/io.h | 5 ++---
> arch/loongarch/include/asm/pgtable-bits.h | 4 +++-
> arch/loongarch/kernel/setup.c | 10 +++++-----
> 3 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/arch/loongarch/include/asm/io.h b/arch/loongarch/include/asm/io.h
> index 0dcb36b32cb25..290aad87a8847 100644
> --- a/arch/loongarch/include/asm/io.h
> +++ b/arch/loongarch/include/asm/io.h
> @@ -52,10 +52,9 @@ static inline void __iomem *ioremap_prot(phys_addr_t offset, unsigned long size,
> * @offset: bus address of the memory
> * @size: size of the resource to map
> */
> -extern pgprot_t pgprot_wc;
> -
> #define ioremap_wc(offset, size) \
> - ioremap_prot((offset), (size), pgprot_val(pgprot_wc))
> + ioremap_prot((offset), (size), pgprot_val( \
> + wc_enabled ? PAGE_KERNEL_WUC : PAGE_KERNEL_SUC))
>
> #define ioremap_cache(offset, size) \
> ioremap_prot((offset), (size), pgprot_val(PAGE_KERNEL))
> diff --git a/arch/loongarch/include/asm/pgtable-bits.h b/arch/loongarch/include/asm/pgtable-bits.h
> index 35348d4c4209a..a3d827701736d 100644
> --- a/arch/loongarch/include/asm/pgtable-bits.h
> +++ b/arch/loongarch/include/asm/pgtable-bits.h
> @@ -92,6 +92,8 @@
>
> #ifndef __ASSEMBLY__
>
> +extern bool wc_enabled;
> +
> #define _PAGE_IOREMAP pgprot_val(PAGE_KERNEL_SUC)
>
> #define pgprot_noncached pgprot_noncached
> @@ -111,7 +113,7 @@ static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
> {
> unsigned long prot = pgprot_val(_prot);
>
> - prot = (prot & ~_CACHE_MASK) | _CACHE_WUC;
> + prot = (prot & ~_CACHE_MASK) | (wc_enabled ? _CACHE_WUC : _CACHE_SUC);
>
> return __pgprot(prot);
> }
> diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
> index 7783f0a3d742c..465c1dbb6f4b4 100644
> --- a/arch/loongarch/kernel/setup.c
> +++ b/arch/loongarch/kernel/setup.c
> @@ -161,19 +161,19 @@ static void __init smbios_parse(void)
> }
>
> #ifdef CONFIG_ARCH_WRITECOMBINE
> -pgprot_t pgprot_wc = PAGE_KERNEL_WUC;
> +bool wc_enabled = true;
> #else
> -pgprot_t pgprot_wc = PAGE_KERNEL_SUC;
> +bool wc_enabled;
> #endif
>
> -EXPORT_SYMBOL(pgprot_wc);
> +EXPORT_SYMBOL(wc_enabled);
>
> static int __init setup_writecombine(char *p)
> {
> if (!strcmp(p, "on"))
> - pgprot_wc = PAGE_KERNEL_WUC;
> + wc_enabled = true;
> else if (!strcmp(p, "off"))
> - pgprot_wc = PAGE_KERNEL_SUC;
> + wc_enabled = false;
> else
> pr_warn("Unknown writecombine setting \"%s\".\n", p);
>
Good catch!
But this will make the write combine(WC) mappings completely unusable on LoongArch.
This is nearly equivalent to say that LoongArch don't support write combine at all.
But the write combine(WC) mappings works fine for software based drm drivers,
such as drm/loongson and drm/ast etc. Even include drm/radeon and drm/amdgpu with
pure software rendering setup (by putting Option "Accel" "off" into 10-amdgpu.conf
or 10-radeon.conf) After merge this patch, the performance drop dramatically for
2D software rendering based display controller drivers.
Well, this patch itself is a good catch, as it is a fix for the commit <16c52e503043>
("LoongArch: Make WriteCombine configurable for ioremap()"). But I'm afraid that
both of this commit and the <16c52e503043> commit are not a *real* fix write combine
related issue on LoongArch. It just negative sidestep of the real problem.
Powered by blists - more mailing lists