linux-kernel - Re: Folios give an 80% performance win


Message-ID: <4ab2f8c4-38ce-3860-1465-e04dea4017b2@MichaelLarabel.com>
Date:   Sat, 24 Jul 2021 17:23:22 -0500
From:   Michael Larabel <Michael@...haelLarabel.com>
To:     Andres Freund <andres@...razel.de>,
        Matthew Wilcox <willy@...radead.org>
Cc:     James Bottomley <James.Bottomley@...senpartnership.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Darrick J. Wong" <djwong@...nel.org>,
        Christoph Hellwig <hch@....de>
Subject: Re: Folios give an 80% performance win

On 7/24/21 4:44 PM, Andres Freund wrote:
> Hi,
>
> On 2021-07-24 12:12:36 -0700, Andres Freund wrote:
>> On Sat, Jul 24, 2021, at 12:01, Matthew Wilcox wrote:
>>> On Sat, Jul 24, 2021 at 11:45:26AM -0700, Andres Freund wrote:
>>> It's always possible I just broke something.  The xfstests aren't
>>> exhaustive, and no regressions doesn't mean no problems.
>>>
>>> Can you guide Michael towards parameters for pgbench that might give
>>> an indication of performance on a more realistic workload that doesn't
>>> entirely fit in memory?
>> Fitting in memory isn't bad - that's a large part of real workloads. It just makes it hard to believe the performance improvement, given that we expect to be bound by disk sync speed...
> I just tried to compare folio-14 vs its baseline, testing commit 8096acd7442e
> against 480552d0322d. In a VM however (but at least with its memory being
> backed by huge pages and storage being passed through).  I got about 7%
> improvement with just some baseline tuning of postgres applied. I think 1-2%
> of that is potentially run-to-run variance (I saw slightly different timings
> around checkpointing that led to a slightly "unfair" advantage for the
> folio run).
>
> That's a *nice* win!
>
> WRT the ~70% improvement:
>
>> Michael, where do I find more details about the configuration used during the
>> run?
> After some digging I found https://github.com/phoronix-test-suite/phoronix-test-suite/blob/94562dd4a808637be526b639d220c7cd937e2aa1/ob-cache/test-profiles/pts/pgbench-1.10.1/install.sh
> For one, the test says it's done on ext4, while I used xfs. But I think the
> bigger thing is the following:

Yes, that is the run/setup script used. The additional pgbench arguments
passed at run-time are outlined in

https://github.com/phoronix-test-suite/phoronix-test-suite/blob/94562dd4a808637be526b639d220c7cd937e2aa1/ob-cache/test-profiles/pts/pgbench-1.10.1/test-definition.xml

Though in this case it is quite straightforward: they correspond to the
relevant -s and -c options for pgbench, which in turn are what is shown on
the pgbench graphs.
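
For illustration, here is a minimal sketch of how -s and -c values could map
onto pgbench invocations. It is not taken from the test profile: the helper
name, the database name, and the example values are hypothetical. Only the
pgbench flags themselves (-i, -s, -c, -T) are standard.

```python
def pgbench_commands(scale, clients, duration_s, dbname="pgbench"):
    """Build argument lists for a pgbench initialization and a timed run.

    Hypothetical helper: 'dbname' and all parameter values are placeholders,
    not the settings used by the test profile.
    """
    if scale < 1 or clients < 1 or duration_s < 1:
        raise ValueError("scale, clients and duration_s must be positive")

    # -i initializes the pgbench tables, -s sets the scale factor,
    # which determines the size of the generated data set.
    init_cmd = ["pgbench", "-i", "-s", str(scale), dbname]

    # -c sets the number of concurrent clients, -T the run time in seconds.
    run_cmd = ["pgbench", "-c", str(clients), "-T", str(duration_s), dbname]
    return init_cmd, run_cmd


if __name__ == "__main__":
    # Example values only; print the commands rather than executing them.
    for cmd in pgbench_commands(scale=100, clients=256, duration_s=15):
        print(" ".join(cmd))
```

The lists can be handed to subprocess.run() unchanged if one actually wants
to launch the commands.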

I have been running some more PostgreSQL tests on other hardware, as well
as via HammerDB and other databases. I will send that over when wrapped up,
likely tomorrow.

Michael


>
> The phoronix test uses postgres with only one relevant setting adjusted
> (increasing the max connection count). That will end up using a buffer pool of
> 128MB, no huge pages, and importantly is configured to aim for not more than
> 1GB for postgres' journal, which will lead to constant checkpointing. The test
> also only runs for 15 seconds, which likely isn't even enough to "warm up"
> (the creation of the data set here will take longer than the run).
>
> Given that the dataset phoronix is using is about 16GB of data (excluding
> WAL), and uses 256 concurrent clients running full tilt, using such limited
> postgres settings doesn't end up measuring something particularly interesting
> in my opinion.
>
> Without changing the filesystem, using a configuration more similar to
> phoronix', I do get a bigger win. But the run-to-run variance is so high
> (largely due to the short test duration) that I don't trust those results
> much.
>
> It does look like there's less of a slowdown due to checkpoints (i.e. fsyncing
> all data files postgres modified since the last checkpoint) on the folio
> branch, which does make some sense to me and would be a welcome improvement.
>
> Greetings,
>
> Andres Freund
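
The concern quoted above about run-to-run variance on short runs can be
illustrated with a minimal sketch that compares the mean throughput of two
sets of repeated runs against their spread. This is a crude heuristic, not a
proper statistical test, and it is not something used in this thread: the
function name and the threshold rule are assumptions, and the sample numbers
are made up.

```python
from statistics import mean, stdev


def compare_runs(baseline_tps, candidate_tps):
    """Compare repeated throughput measurements from two kernels.

    Returns the relative improvement of the candidate over the baseline and
    the larger of the two coefficients of variation, both in percent.
    Heuristic (an assumption): only trust the improvement if it is larger
    than the observed run-to-run noise.
    """
    if len(baseline_tps) < 2 or len(candidate_tps) < 2:
        raise ValueError("need at least two runs per configuration")

    base_mean = mean(baseline_tps)
    cand_mean = mean(candidate_tps)

    improvement_pct = (cand_mean - base_mean) / base_mean * 100.0
    # Coefficient of variation: sample standard deviation relative to mean.
    noise_pct = max(
        stdev(baseline_tps) / base_mean,
        stdev(candidate_tps) / cand_mean,
    ) * 100.0

    return {
        "improvement_pct": improvement_pct,
        "noise_pct": noise_pct,
        "exceeds_noise": abs(improvement_pct) > noise_pct,
    }


if __name__ == "__main__":
    # Made-up transactions-per-second samples, for illustration only.
    print(compare_runs([98, 100, 102], [105, 107, 109]))
```

With longer runs and more repetitions the noise estimate tightens, which is
one reason a 15 second run is hard to interpret.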