lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e95d97c98caa525b04cf6383a74db9cadf694afb.camel@icenowy.me>
Date:   Tue, 10 Oct 2023 08:50:21 +0800
From:   Icenowy Zheng <uwu@...nowy.me>
To:     Sui Jingfeng <suijingfeng@...ngson.cn>,
        Huacai Chen <chenhuacai@...nel.org>,
        WANG Xuerui <kernel@...0n.name>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Weihao Li <liweihao@...ngson.cn>,
        "Mike Rapoport (IBM)" <rppt@...nel.org>,
        Jun Yi <yijun@...ngson.cn>, Baoquan He <bhe@...hat.com>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        David Hildenbrand <david@...hat.com>,
        Hongchen Zhang <zhanghongchen@...ngson.cn>,
        Binbin Zhou <zhoubinbin@...ngson.cn>,
        Zhen Lei <thunder.leizhen@...wei.com>,
        Tiezhu Yang <yangtiezhu@...ngson.cn>,
        Thomas Gleixner <tglx@...utronix.de>,
        Zhihong Dong <donmor3000@...mail.com>,
        loongarch@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] loongarch/mm: disable WUC for pgprot_writecombine as
 same as ioremap_wc

在 2023-10-09星期一的 22:32 +0800,Sui Jingfeng写道:
> Hi,
> 
> 
> On 2023/10/9 12:28, Icenowy Zheng wrote:
> > Currently the code disables WUC only disables it for ioremap_wc(),
> > which
> > is only used when mapping writecombine pages like ioremap() (mapped
> > to
> > the kernel space). For VRAM mapped in TTM/GEM, it's mapped with a
> > crafted pgprot with pgprot_writecombine() function, which isn't
> > corrently disabled now.
> > 
> > Disable WUC for pgprot_writecombine() (fallback to SUC) too.
> > 
> > This improves AMDGPU driver stability on Loongson 3A5000 machines.
> > 
> > Signed-off-by: Icenowy Zheng <uwu@...nowy.me>
> > ---
> > Changes since v1:
> > - Removed _WC macros
> > - Mention ioremap_wc in commit message
> > 
> >   arch/loongarch/include/asm/io.h           |  5 ++---
> >   arch/loongarch/include/asm/pgtable-bits.h |  4 +++-
> >   arch/loongarch/kernel/setup.c             | 10 +++++-----
> >   3 files changed, 10 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/loongarch/include/asm/io.h
> > b/arch/loongarch/include/asm/io.h
> > index 0dcb36b32cb25..290aad87a8847 100644
> > --- a/arch/loongarch/include/asm/io.h
> > +++ b/arch/loongarch/include/asm/io.h
> > @@ -52,10 +52,9 @@ static inline void __iomem
> > *ioremap_prot(phys_addr_t offset, unsigned long size,
> >    * @offset:    bus address of the memory
> >    * @size:      size of the resource to map
> >    */
> > -extern pgprot_t pgprot_wc;
> > -
> >   #define ioremap_wc(offset, size)      \
> > -       ioremap_prot((offset), (size), pgprot_val(pgprot_wc))
> > +       ioremap_prot((offset), (size), pgprot_val( \
> > +               wc_enabled ? PAGE_KERNEL_WUC : PAGE_KERNEL_SUC))
> >   
> >   #define ioremap_cache(offset, size)   \
> >         ioremap_prot((offset), (size), pgprot_val(PAGE_KERNEL))
> > diff --git a/arch/loongarch/include/asm/pgtable-bits.h
> > b/arch/loongarch/include/asm/pgtable-bits.h
> > index 35348d4c4209a..a3d827701736d 100644
> > --- a/arch/loongarch/include/asm/pgtable-bits.h
> > +++ b/arch/loongarch/include/asm/pgtable-bits.h
> > @@ -92,6 +92,8 @@
> >   
> >   #ifndef __ASSEMBLY__
> >   
> > +extern bool wc_enabled;
> > +
> >   #define _PAGE_IOREMAP         pgprot_val(PAGE_KERNEL_SUC)
> >   
> >   #define pgprot_noncached pgprot_noncached
> > @@ -111,7 +113,7 @@ static inline pgprot_t
> > pgprot_writecombine(pgprot_t _prot)
> >   {
> >         unsigned long prot = pgprot_val(_prot);
> >   
> > -       prot = (prot & ~_CACHE_MASK) | _CACHE_WUC;
> > +       prot = (prot & ~_CACHE_MASK) | (wc_enabled ? _CACHE_WUC :
> > _CACHE_SUC);
> >   
> >         return __pgprot(prot);
> >   }
> > diff --git a/arch/loongarch/kernel/setup.c
> > b/arch/loongarch/kernel/setup.c
> > index 7783f0a3d742c..465c1dbb6f4b4 100644
> > --- a/arch/loongarch/kernel/setup.c
> > +++ b/arch/loongarch/kernel/setup.c
> > @@ -161,19 +161,19 @@ static void __init smbios_parse(void)
> >   }
> >   
> >   #ifdef CONFIG_ARCH_WRITECOMBINE
> > -pgprot_t pgprot_wc = PAGE_KERNEL_WUC;
> > +bool wc_enabled = true;
> >   #else
> > -pgprot_t pgprot_wc = PAGE_KERNEL_SUC;
> > +bool wc_enabled;
> >   #endif
> >   
> > -EXPORT_SYMBOL(pgprot_wc);
> > +EXPORT_SYMBOL(wc_enabled);
> >   
> >   static int __init setup_writecombine(char *p)
> >   {
> >         if (!strcmp(p, "on"))
> > -               pgprot_wc = PAGE_KERNEL_WUC;
> > +               wc_enabled = true;
> >         else if (!strcmp(p, "off"))
> > -               pgprot_wc = PAGE_KERNEL_SUC;
> > +               wc_enabled = false;
> >         else
> >                 pr_warn("Unknown writecombine setting \"%s\".\n",
> > p);
> >   
> 
> 
> Good catch!
> 
> But this will make the write combine(WC) mappings completely unusable
> on LoongArch.

Not completely unusable -- we always have the writecombine=on kernel
parameter that overrides it (or CONFIG_ARCH_WRITECOMBINE build-time
option).

People knowing what they're doing can utilize them to gain back the
performace of WUC, however for the default setup I prefer functionality
correctness to (unstable) performance.

> This is nearly equivalent to say that LoongArch don't support write
> combine at all.
> But the write combine(WC) mappings works fine for software based drm
> drivers,
> such as drm/loongson and drm/ast etc. Even include drm/radeon and
> drm/amdgpu with
> pure software rendering setup (by putting Option "Accel" "off" into
> 10-amdgpu.conf
> or 10-radeon.conf) After merge this patch, the performance drop
> dramatically for
> 2D software rendering based display controller drivers.
> 
> Well, this patch itself is a good catch, as it is a fix for the
> commit <16c52e503043>
> ("LoongArch: Make WriteCombine configurable for ioremap()"). But I'm
> afraid that
> both of this commit and the <16c52e503043> commit are not a *real*
> fix write combine

Yes, this is a workaround rather than a fix, however I'm a software
developer and this is the best I can do.

If you can, please report this to your hardware developers and try to
get it fixed for 3A7000/7A3000 (or maybe grabbing some chicken bit that
will fix this for existing chips). I will be quite appreciated.

Thanks,
Icenowy

> related issue on LoongArch. It just negative sidestep of the real
> problem.
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ