Message-ID: <CAOUHufaWbWZ-q-PUJnjXD_jDk1s34mcg4vHU8CtAtmeAT-deRA@mail.gmail.com>
Date: Sun, 18 Jun 2023 14:11:11 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Marc Zyngier <maz@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Alistair Popple <apopple@...dia.com>,
Anup Patel <anup@...infault.org>,
Ben Gardon <bgardon@...gle.com>,
Borislav Petkov <bp@...en8.de>,
Catalin Marinas <catalin.marinas@....com>,
Chao Peng <chao.p.peng@...ux.intel.com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Fabiano Rosas <farosas@...ux.ibm.com>,
Gaosheng Cui <cuigaosheng1@...wei.com>,
Gavin Shan <gshan@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
James Morse <james.morse@....com>,
"Jason A. Donenfeld" <Jason@...c4.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Jonathan Corbet <corbet@....net>,
Masami Hiramatsu <mhiramat@...nel.org>,
Michael Ellerman <mpe@...erman.id.au>,
Michael Larabel <michael@...haellarabel.com>,
Mike Rapoport <rppt@...nel.org>,
Nicholas Piggin <npiggin@...il.com>,
Oliver Upton <oliver.upton@...ux.dev>,
Paul Mackerras <paulus@...abs.org>,
Peter Xu <peterx@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Steven Rostedt <rostedt@...dmis.org>,
Suzuki K Poulose <suzuki.poulose@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Thomas Huth <thuth@...hat.com>, Will Deacon <will@...nel.org>,
Zenghui Yu <yuzenghui@...wei.com>, kvmarm@...ts.linux.dev,
kvm@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linuxppc-dev@...ts.ozlabs.org,
linux-trace-kernel@...r.kernel.org, x86@...nel.org,
linux-mm@...gle.com
Subject: Re: kvm/arm64: Spark benchmark
On Fri, Jun 9, 2023 at 7:04 AM Marc Zyngier <maz@...nel.org> wrote:
>
> On Fri, 09 Jun 2023 01:59:35 +0100,
> Yu Zhao <yuzhao@...gle.com> wrote:
> >
> > TLDR
> > ====
> > Apache Spark spent 12% less time sorting four billion random integers twenty times (in ~4 hours) after this patchset [1].
>
> Why are the 3 architectures you have considered being evaluated with 3
> different benchmarks?
I was hoping people with special interests in different archs might
try to reproduce the benchmarks I didn't report (but did cover) and
see what happens.
> I am not suspecting you to have cherry-picked
> the best results
I'm generally very conservative when reporting *synthetic* results.
For example, the same memcached benchmark I used on powerpc yielded a
>50% improvement on aarch64, because the default Ubuntu Kconfig uses a
64KB base page size for powerpc but 4KB for aarch64. (Before the
series, the reclaim (swap) path takes kvm->mmu_lock for *write* at a
cost of O(nr of all pages considered); after the series, that becomes
O(actual nr of pages to swap), which is <10% given how the benchmark
was set up.)
          Ops/sec     Avg. Latency  p50 Latency  p99 Latency  p99.9 Latency
---------------------------------------------------------------------------
Before    639511.40   0.09940       0.04700      0.27100      22.52700
After     974184.60   0.06471       0.04700      0.15900       3.75900
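For what it's worth, the >50% figure can be sanity-checked from the
table above, and the locking-cost argument sketched with a toy model
(function names here are hypothetical, not actual KVM code):

```python
# Throughput improvement from the memcached table above:
before_ops, after_ops = 639511.40, 974184.60
improvement = after_ops / before_ops - 1
print(f"{improvement:.1%}")  # ~52%, i.e. the ">50%" mentioned above

# Toy model of the locking cost: before the series, every page
# considered for reclaim is aged under kvm->mmu_lock held for write;
# after, aging is lockless and only pages actually swapped take the
# write lock.
def pages_under_write_lock(nr_considered, nr_to_swap, patched):
    return nr_to_swap if patched else nr_considered

considered = 1_000_000
to_swap = considered // 10  # "<10%" per the benchmark setup above
assert pages_under_write_lock(considered, to_swap, patched=False) == 1_000_000
assert pages_under_write_lock(considered, to_swap, patched=True) == 100_000
```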
> but I'd really like to see a variety of benchmarks
> that exercise this stuff differently.
I'd be happy to try other synthetic workloads that people think are
relatively representative. Also, I've backported the series and
started an A/B experiment involving ~1 million devices (real-world
workloads). We should have the preliminary results by the time I post
the next version.