Date:   Sat, 28 Sep 2019 04:23:28 +0000
From:   Atish Patra <Atish.Patra@....com>
To:     "paul.walmsley@...ive.com" <paul.walmsley@...ive.com>
CC:     "hch@....de" <hch@....de>,
        "linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
        Damien Le Moal <Damien.LeMoal@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Anup Patel <Anup.Patel@....com>,
        "aou@...s.berkeley.edu" <aou@...s.berkeley.edu>,
        "palmer@...ive.com" <palmer@...ive.com>
Subject: Re: [PATCH v4 0/3] Optimize tlbflush path

On Fri, 2019-08-30 at 19:50 -0700, Paul Walmsley wrote:
> Hi Atish,
> 
> On Thu, 22 Aug 2019, Atish Patra wrote:
> 
> > This series adds few optimizations to reduce the trap cost in the tlb
> > flush path. We should only make SBI calls to remote tlb flush only if
> > absolutely required.
> 
> The patches look great.  My understanding is that these optimization
> patches may actually be a partial workaround for the TLB flushing bug
> that we've been looking at for the last month or so, which can corrupt
> memory or crash the system.
>
> If that's the case, let's first root-cause the underlying bug.
> Otherwise we'll just be papering over the actual issue, which probably
> could still occur even with this series, correct?  Since it contains no
> explicit fixes?
> 
> 
I have tested the glibc locale install issue on both QEMU and the
Unleashed board. I don't see any issue with OpenSBI master + Linux v5.3
kernel.

Based on our investigation, this looks like a hardware erratum on the
Unleashed board, as the memory corruption only happens in case of a
tlb range flush. In RISC-V, an sfence.vma with an address operand only
covers a single page. If the range is larger than that, OpenSBI has to
issue multiple sfence.vma instructions back to back, possibly leading
to the memory corruption.
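
To make the per-page cost concrete, here is a minimal sketch of a range
flush done one page at a time. It is not the OpenSBI code: the names
flush_range_by_page, sfence_vma_page and sfence_count are made up for
illustration, the 4 KiB page size is an assumption, and the instruction
itself is replaced by a counter so the snippet builds on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages. */
#define PAGE_SIZE_BYTES 4096UL

/* Counts how many per-page flushes were requested (illustration only). */
static unsigned long sfence_count;

/*
 * Hypothetical stand-in for the real instruction. On RISC-V hardware
 * this would be something like:
 *   __asm__ __volatile__("sfence.vma %0" : : "r"(vaddr) : "memory");
 */
static void sfence_vma_page(uintptr_t vaddr)
{
    (void)vaddr;
    sfence_count++;
}

/* Flush [start, start + size) one page at a time, back to back. */
static void flush_range_by_page(uintptr_t start, unsigned long size)
{
    /* Assumption for this sketch: the caller passes a page-aligned start. */
    assert((start % PAGE_SIZE_BYTES) == 0);

    for (unsigned long off = 0; off < size; off += PAGE_SIZE_BYTES)
        sfence_vma_page(start + off);
}
```

With these assumptions, a 1G range turns into 262144 consecutive
per-page flushes, while a 4K range needs exactly one.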

Currently, OpenSBI has a platform feature, "tlb_range_flush_limit",
that allows configuring the tlb flush threshold per platform. Any tlb
range flush request larger than this threshold is converted to a full
flush. Currently, it is set to the default value of 4K for every
platform[1]. The glibc locale install memory corruption only happens if
this threshold is changed to a higher value, e.g. 1G. This doesn't
change anything in the OpenSBI code path except that it will issue
many sfence.vma instructions back to back instead of one.
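
The threshold behaviour described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
actual OpenSBI implementation: choose_flush, sfence_issued and
FLUSH_LIMIT_DEFAULT are hypothetical names, and a 4 KiB page size is
assumed.

```c
#include <assert.h>

/* Assumption: 4 KiB pages; the default limit is the 4K value noted above. */
#define SKETCH_PAGE_SIZE    4096UL
#define FLUSH_LIMIT_DEFAULT (1UL << 12)

enum flush_kind { FLUSH_RANGE, FLUSH_ALL };

/* A request larger than the platform limit is converted to a full flush. */
static enum flush_kind choose_flush(unsigned long size, unsigned long limit)
{
    return (size > limit) ? FLUSH_ALL : FLUSH_RANGE;
}

/* Number of sfence.vma instructions the chosen strategy would issue. */
static unsigned long sfence_issued(unsigned long size, unsigned long limit)
{
    if (choose_flush(size, limit) == FLUSH_ALL)
        return 1; /* one sfence.vma without an address flushes everything */

    /* One per page, rounding a partial page up. */
    return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For a 1G request, sfence_issued(1UL << 30, FLUSH_LIMIT_DEFAULT) is 1,
whereas raising the limit to 1G gives 262144 back-to-back instructions,
which is the configuration where the corruption shows up.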

If the hardware team at SiFive can look into this as well, it would be
great.

To conclude, we think this issue needs to be investigated by the
hardware team, and the kernel patches can be merged to get the
performance benefit.

[1] https://github.com/riscv/opensbi/blob/master/include/sbi/sbi_platform.h#L40



> - Paul

-- 
Regards,
Atish
