Date:   Tue, 30 Aug 2022 22:17:31 -0600
From:   Yu Zhao <yuzhao@...gle.com>
To:     linux-mm@...ck.org
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Andi Kleen <ak@...ux.intel.com>,
        Aneesh Kumar <aneesh.kumar@...ux.ibm.com>,
        Catalin Marinas <catalin.marinas@....com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Hillf Danton <hdanton@...a.com>, Jens Axboe <axboe@...nel.dk>,
        Johannes Weiner <hannes@...xchg.org>,
        Jonathan Corbet <corbet@....net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Michael Larabel <Michael@...haellarabel.com>,
        Michal Hocko <mhocko@...nel.org>,
        Mike Rapoport <rppt@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Tejun Heo <tj@...nel.org>, Vlastimil Babka <vbabka@...e.cz>,
        Will Deacon <will@...nel.org>,
        linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, x86@...nel.org,
        page-reclaim@...gle.com
Subject: OpenWrt / MIPS benchmark with MGLRU

TLDR
====
RAM utilization  Throughput (95% CI)  P99 Latency (95% CI)
----------------------------------------------------------
~90%             NS                   NS
~110%            +[12, 16]%           -[20, 22]%

Abbreviations
=============
CI:   confidence interval
NS:   no statistically significant difference
DUT:  device under test
ATE:  automatic test equipment

Rationale
========
1. OpenWrt is the most popular distro for WiFi routers; many of its
   targets are big-endian [1].
2. 4 out of the top 5 bestselling WiFi routers in the US use MIPS [2];
   MIPS uses a software-managed TLB.
3. Memcached is the best available memory benchmark on OpenWrt;
   admittedly, such a use case is very limited in the real world.

Hardware
========
DUT: Ubiquiti EdgeRouter (ER-8) [3]

DUT # cat /proc/cpuinfo
system type             : UBNT_E200 (CN6120p1.1-800-NSP)
machine                 : Unknown
processor               : 0
cpu model               : Cavium Octeon II V0.1
BogoMIPS                : 1600.00
wait instruction        : yes
microsecond timers      : yes
tlb_entries             : 128
extra interrupt vector  : yes
hardware watchpoint     : yes, count: 2, address/irw mask: [0x0ffc, 0x0ffb]
isa                     : mips1 mips2 mips3 mips4 mips5 mips32r1 mips32r2 mips64r1 mips64r2
ASEs implemented        :
Options implemented     : tlb rixiex 4kex octeon_cache 32fpr prefetch mcheck ejtag llsc rixi lpa vtag_icache userlocal perf_cntr_intr_bit perf
shadow register sets    : 1
kscratch registers      : 3
package                 : 0
core                    : 0
VCED exceptions         : not available
VCEI exceptions         : not available

processor               : 1
cpu model               : Cavium Octeon II V0.1
BogoMIPS                : 1600.00
wait instruction        : yes
microsecond timers      : yes
tlb_entries             : 128
extra interrupt vector  : yes
hardware watchpoint     : yes, count: 2, address/irw mask: [0x0ffc, 0x0ffb]
isa                     : mips1 mips2 mips3 mips4 mips5 mips32r1 mips32r2 mips64r1 mips64r2
ASEs implemented        :
Options implemented     : tlb rixiex 4kex octeon_cache 32fpr prefetch mcheck ejtag llsc rixi lpa vtag_icache userlocal perf_cntr_intr_bit perf
shadow register sets    : 1
kscratch registers      : 3
package                 : 0
core                    : 1
VCED exceptions         : not available
VCEI exceptions         : not available

DUT # cat /proc/meminfo
MemTotal:        1991964 kB
MemFree:         1917304 kB
MemAvailable:    1896856 kB
Buffers:               4 kB
Cached:            33464 kB
SwapCached:            0 kB
Active:             1316 kB
Inactive:          33500 kB
Active(anon):       1316 kB
Inactive(anon):    33496 kB
Active(file):          0 kB
Inactive(file):        4 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:        995324 kB
SwapFree:         995324 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:          1360 kB
Mapped:             2688 kB
Shmem:             33464 kB
KReclaimable:       8244 kB
Slab:              19772 kB
SReclaimable:       8244 kB
SUnreclaim:        11528 kB
KernelStack:        1056 kB
PageTables:          336 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1991304 kB
Committed_AS:      38916 kB
VmallocTotal: 1069547512 kB
VmallocUsed:        4856 kB
VmallocChunk:          0 kB
Percpu:              272 kB

Software
========
DUT # cat /etc/openwrt_release
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='22.03.0-rc6'
DISTRIB_REVISION='r19590-042d558536'
DISTRIB_TARGET='octeon/generic'
DISTRIB_ARCH='mips64_octeonplus'
DISTRIB_DESCRIPTION='OpenWrt 22.03.0-rc6 r19590-042d558536'
DISTRIB_TAINTS='no-all no-ipv6'

DUT # uname -a
Linux OpenWrt 6.0.0-rc3+ #0 SMP Sun Jul 31 15:12:47 2022 mips64 GNU/Linux

DUT # cat /proc/swaps
Filename    Type       Size    Used  Priority
/dev/zram0  partition  995324  0     100

DUT # memcached -V
memcached 1.6.9

DUT # cat /etc/config/memcached
config memcached
        option user 'memcached'
        option maxconn '1024'
        option listen '0.0.0.0'
        option port '11211'
        option memory '6400'

ATE $ memtier_benchmark -v
memtier_benchmark 1.3.0
Copyright (C) 2011-2022 Redis Ltd.
This is free software.  You may redistribute copies of it under the terms of
the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
There is NO WARRANTY, to the extent permitted by law.

Procedure
=========
ATE $ cat run_benchmark_matrix.sh
run_memtier_benchmark()
{
    # boot to kernel $3

    # populate dataset
    memtier_benchmark/memtier_benchmark -s $DUT_IP -p 11211 \
        -P memcache_binary -n allkeys -c 1 --ratio 1:0 --pipeline 8 \
        --key-minimum=1 --key-maximum=$2 --key-pattern=P:P \
        -d 1000

    # access dataset using Gaussian pattern
    memtier_benchmark/memtier_benchmark -s $DUT_IP -p 11211 \
        -P memcache_binary --test-time $1 -c 1 --ratio 0:1 \
        --pipeline 8 --key-minimum=1 --key-maximum=$2 \
        --key-pattern=G:G --randomize --distinct-client-seed

    # collect results
}

run_duration_secs=1200
mem_utils_90_110=(1600000 2000000)
kernels=("baseline" "patched")

for mem_util in "${mem_utils_90_110[@]}"; do
    for kernel in "${kernels[@]}"; do
        run_memtier_benchmark "$run_duration_secs" "$mem_util" "$kernel"
    done
done
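
As a rough sanity check on the key counts above (not part of the
original procedure): with -d 1000, each key carries a 1000-byte value,
so the value payload alone can be compared against MemTotal from
/proc/meminfo. The helper name payload_pct is hypothetical. The
calculation ignores keys, memcached per-item overhead and everything
else running on the DUT, which presumably accounts for the gap to the
~90% and ~110% figures; that explanation is an assumption.

```shell
#!/bin/sh
# payload_pct KEYS VALUE_BYTES MEMTOTAL_KB   (hypothetical helper)
# Prints the value payload as a whole-number percentage of MemTotal.
# Counts values only: no keys, no item headers, no slab rounding.
payload_pct()
{
    awk -v keys="$1" -v bytes="$2" -v total_kb="$3" \
        'BEGIN { printf "%.0f\n", keys * bytes / 1024 / total_kb * 100 }'
}

# MemTotal is taken from the /proc/meminfo output in the Hardware section.
payload_pct 1600000 1000 1991964    # prints 78
payload_pct 2000000 1000 1991964    # prints 98
```

The larger dataset is therefore close to MemTotal on payload alone, so
that configuration has to rely on the zram swap device.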

Results
=======
Baseline                                 90% RAM utilization
------------------------------------------------------------
Ops/sec   Avg. Lat.  p50 Lat.  p99 Lat.  p99.9 Lat.  KB/sec
------------------------------------------------------------
48550.71  0.65687    0.48700   2.84700   5.56700     1812.25
48600.55  0.65629    0.48700   2.86300   5.59900     1814.11
48562.37  0.65674    0.48700   2.84700   5.50300     1812.68
48556.66  0.65688    0.48700   2.84700   5.53500     1812.47
48619.50  0.65600    0.48700   2.87900   5.63100     1814.82
48579.74  0.65654    0.48700   2.84700   5.56700     1813.33
48593.25  0.65764    0.48700   2.86300   5.56700     1814.10
48535.52  0.65716    0.48700   2.86300   5.56700     1811.68
48587.24  0.65645    0.48700   2.83100   5.50300     1813.61
48541.92  0.65704    0.48700   2.81500   5.47100     1811.92

MGLRU                                    90% RAM utilization
------------------------------------------------------------
Ops/sec   Avg. Lat.  p50 Lat.  p99 Lat.  p99.9 Lat.  KB/sec
------------------------------------------------------------
48622.38  0.65594    0.48700   2.81500   5.47100     1814.92
48537.74  0.65715    0.48700   2.84700   5.53500     1811.76
48586.82  0.65646    0.48700   2.84700   5.50300     1813.59
48552.44  0.65695    0.48700   2.83100   5.43900     1812.31
48557.35  0.65680    0.49500   2.83100   5.53500     1812.49
48625.48  0.65593    0.48700   2.81500   5.43900     1815.04
48655.75  0.65557    0.48700   2.84700   5.53500     1816.17
48625.67  0.65595    0.48700   2.84700   5.53500     1815.04
48622.22  0.65600    0.48700   2.84700   5.47100     1814.91
48617.10  0.65610    0.48700   2.84700   5.56700     1814.73

Baseline                                110% RAM utilization
------------------------------------------------------------
Ops/sec   Avg. Lat.  p50 Lat.  p99 Lat.  p99.9 Lat.  KB/sec
------------------------------------------------------------
19813.79  1.61245    0.63100   17.79100  31.74300    744.91
20328.29  1.57158    0.62300   17.27900  31.10300    764.25
20104.12  1.58913    0.62300   17.40700  31.10300    755.82
20342.03  1.57053    0.61500   17.27900  30.84700    764.77
19688.05  1.62268    0.62300   17.91900  31.35900    740.18
19607.31  1.62943    0.63900   17.91900  31.23100    737.15
19250.96  1.65963    0.65500   17.91900  31.10300    723.75
20182.79  1.58290    0.63100   17.40700  30.84700    758.78
20181.88  1.58299    0.63100   17.40700  30.84700    758.75
20615.90  1.54963    0.62300   17.02300  30.84700    775.06

MGLRU                                   110% RAM utilization
------------------------------------------------------------
Ops/sec   Avg. Lat.  p50 Lat.  p99 Lat.  p99.9 Lat.  KB/sec
------------------------------------------------------------
22911.33  1.39405    0.61500   13.69500  28.79900    861.36
22339.08  1.42989    0.61500   14.07900  30.07900    839.85
23394.22  1.36521    0.59900   13.56700  29.05500    879.51
22521.48  1.41830    0.61500   13.88700  29.82300    846.70
22678.10  1.40818    0.61500   13.82300  29.69500    852.59
22344.50  1.42952    0.61500   14.07900  29.95100    840.05
23245.65  1.37406    0.60700   13.50300  28.92700    873.93
23140.17  1.38032    0.59900   13.69500  29.18300    869.96
23003.34  1.38856    0.61500   13.63100  29.05500    864.82
22937.52  1.39253    0.61500   13.69500  29.43900    862.35
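
The raw columns above can be reduced to a point estimate with a few
lines of shell. This is only a sketch: the helper name
pct_change_of_means is hypothetical, and it compares plain sample means
without computing the confidence intervals quoted in the TLDR, since the
method used for those is not stated here.

```shell
#!/bin/sh
# pct_change_of_means BASELINE PATCHED   (hypothetical helper)
# Each argument is a whitespace-separated list of samples.
# Prints the change of the PATCHED mean relative to the BASELINE mean,
# in percent, with one decimal place.
pct_change_of_means()
{
    awk -v base="$1" -v patched="$2" '
        function mean(list,    n, i, v, sum) {
            n = split(list, v, " ")
            sum = 0
            for (i = 1; i <= n; i++)
                sum += v[i]
            return sum / n
        }
        BEGIN { printf "%.1f\n", (mean(patched) / mean(base) - 1) * 100 }
    '
}

# Ops/sec at 110% RAM utilization, copied from the tables above.
ops_base="19813.79 20328.29 20104.12 20342.03 19688.05 19607.31 19250.96 20182.79 20181.88 20615.90"
ops_mglru="22911.33 22339.08 23394.22 22521.48 22678.10 22344.50 23245.65 23140.17 23003.34 22937.52"

# p99 latency at 110% RAM utilization, copied from the tables above.
p99_base="17.79100 17.27900 17.40700 17.27900 17.91900 17.91900 17.91900 17.40700 17.40700 17.02300"
p99_mglru="13.69500 14.07900 13.56700 13.88700 13.82300 14.07900 13.50300 13.69500 13.63100 13.69500"

pct_change_of_means "$ops_base" "$ops_mglru"    # prints 14.2
pct_change_of_means "$p99_base" "$p99_mglru"    # prints -21.5
```

Both point estimates fall inside the intervals reported in the TLDR
(+[12, 16]% for throughput and -[20, 22]% for P99 latency).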

Flame graphs
------------
Baseline: https://drive.google.com/file/d/1-Ac4HMPAyZIqxtvKerUTqNNAgBLhpX9R
MGLRU: https://drive.google.com/file/d/1-9x0W2yIYeiRvXWiYRzL6niTqW7zCVPX

References
==========
[1] https://openwrt.org/docs/platforms/start
[2] https://www.amazon.com/bestsellers/pc/300189
[3] https://openwrt.org/toh/ubiquiti/edgerouter
