Date:   Tue, 7 Jun 2022 12:24:52 +0200
From:   Petr Malat <oss@...at.biz>
To:     David Laight <David.Laight@...lab.com>
Cc:     "linux-mtd@...ts.infradead.org" <linux-mtd@...ts.infradead.org>,
        Joern Engel <joern@...ybastard.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH] mtd: phram: Map RAM using memremap instead of ioremap

Hi!

On Mon, May 23, 2022 at 04:09:20PM +0000, David Laight wrote:
> On x86 (which I know a lot more about) memcpy() has a nasty
> habit of getting implemented as 'rep movsb' relying on the
> cpu to speed it up.
> But that doesn't happen for uncached addresses - so you get
> very slow byte copies.

I have measured the performance with my change (patched) and
without it (orig). The change improves performance on x86-64 and
ARM. On MIPS64 it stays the same:

Tests
=====
All runtimes are in milliseconds and are the average real time of
3 runs, measured with the bash time built-in. The measured process
ran under SCHED_FIFO with priority 99. The page cache was flushed
before every run, but all involved program images were in tmpfs
(no swap).
 - dd r512
   dd if=/dev/TESTDEV of=/dev/null  bs=512
 - dd r1MB
   dd if=/dev/TESTDEV of=/dev/null  bs=1M
 - dd w512
   dd of=/dev/TESTDEV if=/tmpfs/img bs=512
 - dd w1MB
   dd of=/dev/TESTDEV if=/tmpfs/img bs=1M
 - flashcp
   flashcp /tmpfs/img /dev/TESTDEV
 - flasherase
   flash_eraseall -q /dev/TESTDEV
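
The procedure above could be scripted roughly as follows. This is only
a sketch under assumptions, not the script used for the numbers in
this mail: the helper names avg_ms and measure are hypothetical, chrt
is assumed as the way to get SCHED_FIFO priority 99, and the page
cache is assumed to be flushed through /proc/sys/vm/drop_caches. The
measure helper needs root.

```shell
#!/bin/bash
# Hypothetical helpers sketching the measurement procedure.

# Average durations given in seconds (e.g. 0.131) and print the
# result rounded to whole milliseconds.
avg_ms() {
    printf '%s\n' "$@" |
        awk '{ sum += $1 } END { printf "%d\n", sum / NR * 1000 + 0.5 }'
}

# Run a command 3 times and print its average real time in ms.
# Needs root for drop_caches and for SCHED_FIFO.
measure() {
    local runs=3 i t times=()
    local LC_ALL=C         # keep "." as the decimal separator
    local TIMEFORMAT=%R    # bash time built-in prints real time only
    for ((i = 0; i < runs; i++)); do
        # Flush the page cache before every run (assumed method).
        sync
        echo 3 > /proc/sys/vm/drop_caches
        # Run under SCHED_FIFO, priority 99 (assumed to be done via chrt).
        t=$( { time chrt -f 99 "$@" >/dev/null 2>&1; } 2>&1 )
        times+=("$t")
    done
    avg_ms "${times[@]}"
}
```

With these helpers, the "dd r512" case would be run as
measure dd if=/dev/TESTDEV of=/dev/null bs=512.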

Results
=======
All times are in ms

ARCH       |     MIPS64      |       ARM       |     X86-64
CPU        |   CN6335p2.2    |    v7 TI K2     |  Xeon D-1548
Dev. size  |      32MB       |      128MB      |     256MB
-----------+-------+---------+-------+---------+-------+---------
     in ms |  Orig | Patched |  Orig | Patched |  Orig | Patched
dd r512    |   131 |     130 |  1101 |     543 | 22906 |     281
dd r1MB    |    65 |      65 |   655 |     122 | 22715 |      70
dd w512    |  1150 |    1150 |  1136 |    1042 | 28067 |     412
dd w1MB    |   104 |     104 |   396 |     244 | 27761 |     122
flashcp    |   100 |      99 |  1438 |     568 | 78455 |     270
flasherase |    21 |      21 |   208 |      77 | 27707 |      57
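
To read the table as speedup factors, each Orig value can be divided
by the matching Patched value. A small sketch, where the helper name
speedup is hypothetical:

```shell
#!/bin/bash
# Print Orig / Patched as a factor with one decimal place.
speedup() {
    awk -v orig="$1" -v patched="$2" \
        'BEGIN { printf "%.1f\n", orig / patched }'
}

speedup 22906 281   # X8664, dd r512
speedup 655 122     # ARM, dd r1MB
speedup 131 130     # MIPS64, dd r512
```

For these three rows the factors come out at about 81.5, 5.4 and 1.0.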

BR,
  Petr
